I've run across the term "double-blind studies" in reference to medical research. In the operations and research that I've done myself, I've made use of "blind testing" as needed. Blinding is widely considered, both in medicine and in my own field of chemistry, to be the most accurate way to get results uncontaminated by our own wishful thinking.
I wrote earlier about the placebo effect, and blinding the studies is probably the best way to counter it.
In chemistry, we really only need single blinding: the person running the lab tests doesn't know whether a given bottle holds a sample, a duplicate, a standard, or a blank. To do this, I hand over a set of sample bottles with nothing but code numbers written on them and ask the technician to test the lot for a particular set of compounds. A chemical reaction is a chemical reaction; if the same sample doesn't react the same way to the same test, it means somebody did something wrong somewhere along the way. In medicine, it's not so easy because there are patients involved, and their reactions (chemical, biological, and psychological) are all slightly different, and some of them will get better on their own no matter what is given to them.
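The code-number step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `blind_labels`, the three-digit code range, and the sample descriptions are hypothetical, not part of any lab standard.

```python
import random

def blind_labels(samples, seed=None):
    """Assign a unique random code number to each sample.

    Returns (bottles, key):
      bottles - the code numbers to write on the bottles, in sorted order,
                so that bottle position reveals nothing about the contents
      key     - code number -> what the bottle really holds (kept private)
    """
    rng = random.Random(seed)
    # Hypothetical choice: unique three-digit codes.
    codes = rng.sample(range(100, 1000), len(samples))
    key = dict(zip(codes, samples))
    bottles = sorted(codes)
    return bottles, key

samples = ["field sample", "duplicate of field sample", "standard", "blank"]
bottles, key = blind_labels(samples, seed=1)

print("Hand to technician:", bottles)
print("Keep in your own notebook:", key)
```

The technician only ever sees the first list; the key stays with whoever prepared the bottles until the results are in.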
I've mostly used blinded tests in chemistry for quality control. Specifically: are the lab technician and the lab equipment accurate (reading zero on the blanks and reading the known concentration on the standard), and are they consistent (getting the same value on two identical samples presented in different bottles)?
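Once the results come back and the key is unsealed, those accuracy and consistency checks are simple comparisons. Here is a minimal sketch, assuming a single tolerance applies to every check and that the two bottles filled from the same material are both recorded as "sample" in the key. The function name, the code numbers, the readings, and the tolerance are all made up for illustration.

```python
def check_blind_run(readings, key, standard_value, tolerance):
    """Compare blinded readings against what each bottle really held.

    readings       - code number -> measured concentration
    key            - code number -> "blank", "standard", or "sample"
                     (assumption: all "sample" bottles hold the same material)
    standard_value - known concentration of the standard
    tolerance      - largest acceptable deviation (assumed the same for all checks)
    """
    by_role = {}
    for code, role in key.items():
        by_role.setdefault(role, []).append(readings[code])

    blanks = by_role.get("blank", [])
    standards = by_role.get("standard", [])
    duplicates = by_role.get("sample", [])

    return {
        # Accuracy: blanks should read zero, standards should read their known value.
        "blanks_read_zero": all(abs(v) <= tolerance for v in blanks),
        "standard_on_target": all(abs(v - standard_value) <= tolerance for v in standards),
        # Consistency: identical samples in different bottles should agree.
        "duplicates_agree": (max(duplicates) - min(duplicates)) <= tolerance
        if len(duplicates) >= 2 else None,
    }

# Made-up example data.
key = {417: "sample", 203: "blank", 958: "standard", 640: "sample"}
readings = {417: 12.1, 203: 0.0, 958: 10.1, 640: 12.3}

result = check_blind_run(readings, key, standard_value=10.0, tolerance=0.25)
print(result)
```

In practice the acceptable deviation would depend on the method and the compound, so a real check would likely use separate tolerances for blanks, standards, and duplicates.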
In medicine, it seems it's not enough that the patient not be told if they're getting a medicine or a placebo, because humans are so incredibly good at unconsciously reading body language. Apparently a doctor who knows whether he's handing a patient a test medicine or a placebo gives off signals that the patient can read.
To handle this, a third party has to keep track of which patient gets which medicine. That third person never sees the patients, and hands the doctor the entire rack of placebo and trial medicine together, so the doctor gets no unconscious cue about which sort of medicine is in hand and thus can't pass one on to the patient. This, in a nutshell, is double-blinding, and it is why double-blind studies are the gold standard for medical trials.
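The third party's bookkeeping can be pictured as two separate lists: one the doctor sees, and one nobody at the clinic sees. The sketch below is only my illustration of that idea, assuming a simple two-arm trial with roughly equal groups. The function name, the four-digit kit numbers, and the patient IDs are hypothetical, and real trials use more careful randomization schemes than this.

```python
import random

def make_allocation(patient_ids, seed=None):
    """Randomly split patients between medicine and placebo.

    Returns (dispensing_list, key):
      dispensing_list - patient id -> kit number (all the doctor ever sees)
      key             - kit number -> "medicine" or "placebo" (third party only)
    """
    rng = random.Random(seed)
    half = len(patient_ids) // 2
    arms = ["medicine"] * half + ["placebo"] * (len(patient_ids) - half)
    rng.shuffle(arms)  # random order, so kit position carries no information
    # Hypothetical choice: unique four-digit kit numbers.
    kit_numbers = rng.sample(range(1000, 10000), len(patient_ids))
    key = dict(zip(kit_numbers, arms))
    dispensing_list = dict(zip(patient_ids, kit_numbers))
    return dispensing_list, key

patients = ["P01", "P02", "P03", "P04", "P05", "P06"]
dispensing_list, key = make_allocation(patients, seed=7)

print("Doctor's copy:", dispensing_list)
# The key stays sealed with the third party until the trial is over.
```

Because the doctor's copy contains only kit numbers, there is nothing in it to leak, consciously or otherwise.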
The effect of doctors' expectations on the placebo response has been explicitly studied. The trial used three possible treatments: one medicine that reduced pain, one that increased pain, and a placebo. For part of the trial, the doctors were told there was a supply problem with the painkiller. There was no supply problem, but in the part of the trial where the doctors thought the active painkiller wasn't one of the options, the patients who received the placebo showed no placebo pain reduction and instead showed an increase in pain, even though the doctors never told the patients about the "supply problem"! In the part of the trial where the doctors thought the active painkiller was available, patients who received the placebo did show some placebo pain reduction. (In both groups, there were patients who got the real painkiller, and it reduced their pain for real. In both groups, the doctors didn't know what they were actually giving the patient. All they knew was that they'd been told that, of the three possibilities, the one known to reduce pain was having those "supply problems".)
Imagine how strong that effect would be if the doctor giving over the medicine (or placebo) actually knew which one they were handing over.
In chemistry, we need to know if our lab results show the reality, not what we want them to show. In medicine, we need to know if the medicine actually helps people get better faster than they would on their own with no medicine.
There's a little less room for errors of expectation in chemistry, especially with modern equipment that displays the result to three significant figures. That leaves little room for a technician to wonder whether the indicator dye changed colour on the previous drop or the next one, or to decide what the last significant figure is when the reading falls between two index marks. Self-reported pain, as when testing a painkiller, is far more difficult to put on a scale. There is a scale from "no pain" to "the worst pain I've ever felt", and that second one depends a lot on how the patient has hurt themselves in the past.
In spite of that, I have fallen prey to wishful thinking during analysis of good hard numbers. Re-analyzing data six months later, I wonder where I got the conclusion I did when analyzing it the day I did the tests, when I was in the middle of really, really wanting an answer to the problem I was trying to solve. What do you do when the numbers show one thing but the analysis is biased?