> 432 Hz tuned music was associated with a slight decrease of mean (systolic and diastolic) blood pressure values (although not significant),
"Not significant" means that "a slight decrease" is not supported by the data in the conventionally accepted way. By convention, authors should not make claims either way if the effect is not statistically significant, even if they believe the effect is real. So the sentence above is declaring that the authors aren't following the norms of scientific claims.
But they did find a significant decrease in heart rate, about 5 beats per minute.
For me, the difference between 57 and 62 beats per minute (with all other conditions being more or less equal) is the difference between being healthy and having a light cold or enterovirus, i.e. the difference between "everything is alright" and "some suspicious metabolic activity is going on".
"Significant" does not mean "big" or "important" in this context. It means "unlikely to arise by chance more than 1 time in 20 (p=0.05) if we repeated this experiment, given our data and the assumptions in our statistical model".
Really? I think you are being extremely unfavorable to the authors here. They are simply stating that blood pressure went down, but in a non-significant way. There is nothing wrong with that, and they are not making claims either way; they are just reporting the results of the study and literally describing what happened. Please enlighten me: what claims are they allegedly making? That the blood pressure decreased in a non-significant way?
That's a bit of an oversimplification that leaves out important qualifiers. "Non-significant" usually means it's indistinguishable from a null hypothesis when a certain level of randomness is allowed in a single, isolated trial.
Who picks the null hypothesis? How is it picked, i.e. why does that specific hypothesis get favourable treatment? What level of randomness should one allow? What does it even mean for an experiment to be a single, isolated trial? How can anything be?
Those are critical questions to understand the concept, and your explanation just pretends they don't exist.
That's another fallacy of frequentist reasoning: that we have to draw definitive conclusions from evidence, that something is definitely false until we reach "statistical significance", at which point it all of a sudden becomes definitely true.
In real life, to borrow your description, we can hold varying levels of belief in statements depending on how strong the evidence is, and the magnitude of the payoff in the various cases.
Maybe the probability of the result in the study in question is 51 %. That's still more than 50 %. Whether that difference is meaningful to you is not something someone else can decide.
Nobody who knows what they are doing, and uses statistics, can flip from something being definitely true to definitely false. At best, they can find overwhelmingly convincing probabilities close to 0 or 1.
Honest scientists who use statistics do not claim that an effect does not exist. Rather, they say that the experiment that was conducted did not produce sufficient evidence (to a numerically defined standard) to justify believing in the effect.
That is to say, that the existence of the effect, given the results of the experiment, has a low likelihood, and that low likelihood can be statistically quantified.
What that means is that exactly the same results as were observed will, or would, with a high probability, also be observed if the experiment occurs in the null hypothesis universe: the world in which the effect is absent.
So even if we are not in that universe (the effect is real), the experiment didn't show it.
The experiment simply doesn't discriminate between the null hypothesis and its negation to a level that could convince one to hold a probabilistic belief in the existence of the effect.
> the existence of the effect, given the results of the experiment, has a low likelihood, and that low likelihood can be statistically quantified
You have this completely backwards. It means that the likelihood of the null hypothesis was not below some threshold such that it can be "ruled out". It says absolutely nothing about the likelihood of the data if the effect exists.
Of course, but the fact that people apply a binary threshold tells you that they want to be able to rule out some things from their models entirely, and include other things as something that's as good as a true fact.
What does a non-binary threshold look like, and how is it different from just fine-tuning a regular binary threshold to err more or less on the side of caution?
It's not about a non-binary threshold. The problem is having a threshold in the first place.
Say that given the evidence there's a 9 % chance the null hypothesis is true. A frequentist used to a 10 % significance level would then say the effect is true. A frequentist trained on a 5 % significance level would say it is false.
But that's just an arbitrary cutoff that by itself means nothing.
If instead we look at a practical scenario where we would use this result, we understand the problem space better. Maybe we have figured out how to get limited rights to the transpositions of famous music to other A4s, and this would cost a lot to do, but earn us some money if we do it and the effect is real.
Should we acquire those rights or not? Ask the 5 % frequentist and they would say "there's no significant difference, so you shouldn't." Ask the 10 % frequentist and they would say the opposite. Who do we listen to?
Let's ask the poker player. They will ask "What exactly does it cost to do this, and how much will you earn if it works out? That matters!"
So let's say it costs us $100 per song to get the rights to the transposed version, and we think it will earn us $102 per song if the effect is true. Now we can just plug in, remembering that there's a 9 % chance the null hypothesis is correct:
-100 + 102 * 0.91 = -7.18 per song
Not good. What if we made $127 per song with the same cost?
-100 + 127 * 0.91 = +15.57 per song
Worth doing, at the same probability of the null hypothesis!
In other words, you can't determine what's significant until you know how you will use the result.
Statistics by itself is meaningless. It gains meaning only when it's used to choose between actions.
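The back-of-the-envelope arithmetic above can be sketched in a few lines (the costs, payoffs, and the 9 % null probability are the hypothetical numbers from the example, not real figures):

```python
def expected_value(cost, payoff, p_null):
    """Expected profit per song: pay `cost` up front, but the `payoff`
    arrives only if the effect is real (probability 1 - p_null)."""
    return -cost + payoff * (1 - p_null)

print(expected_value(100, 102, 0.09))  # ~ -7.18 per song: not worth it
print(expected_value(100, 127, 0.09))  # ~ +15.57 per song: worth doing
```

Same probability of the null in both cases; only the payoff structure changes the decision.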
That's a binary decision, which is bad, so we shouldn't.
In accordance with the confidence probability, we should buy into a percentage of the rights, and transpose the music to an interpolated value between A=432 and A=440 Hz.
"Not significant" means that the probability is >=5% their result was obtained by chance.
We've settled as a community on a convention that we don't claim an effect is real until it is supported by data ("statistically significant"), i.e. <5% likely to be explained by chance in your results.
"Significant" does not mean big or important in this context. It means better than 5% unlikely to be (un)lucky data.
The threshold for significance lies in the eye of the beholder. A particle physicist might not be satisfied with anything over 0.01 %. A social scientist might be happy to see 10 %.
The 5 % number you mention is completely arbitrary and often woefully inappropriate.
Look at it from a betting perspective. Can you earn more than 10 × your investment if the null hypothesis is false? Then anything less likely than 10 % is significant.
It's a convention for scientific reporting. Your trades are not bound by this convention.
The parameter value is not arbitrary. It's a convention arrived at after hundreds of years. If it were arbitrary, p=0.999 or p=0.00001 would be just as good. We've settled on p=0.05 being usefully convincing but not crazy demanding to obtain by experiment with noisy measurements.
Null hypothesis testing was invented less than 100 years ago by Fisher, who completely arbitrarily picked 0.05 [0]. That value was not arrived at through wisdom of experience, and certainly not after hundreds of years of practice.
Though it has now indeed become conventional to test with p=0.05, there is nothing wrong with reporting an effect that fails the null hypothesis test. At least that is the position of the American Statistical Association [1].
Thanks for these refs. I read [1] carefully and I take your point that it’s ok to strictly report whatever the data says.
On the value itself, we are quibbling about the meaning of ‘arbitrary’: Fisher certainly could have chosen another value, but not all values would be considered useful. Some expertise about the nature of real world data and the minds of statisticians is encoded in the chosen value.
If I propose that we change the convention to use 1e-12 instead and you think ‘that’s too small, I prefer it the way it is’, then it’s not arbitrary in the sense I mean.
The thing you seem to be missing is that there's no one number that's a meaningful limit for all purposes.
What probability you accept as significant should depend entirely on how you plan to use the results. Something with a p value of a staggering 70 % (i.e. it's more likely not true than true) is significant if the payoff is good when it's true, and the cost is small when it's not true.
And 70 % is very far from 5 %!
Then again, if the payoff is tiny compared to the cost, you might ask for a p-value of less than 0.01 %, in order for it to make sense to take the chance on it.
Think like a poker player: a hand that has 1/4 chance of winning needs better than a 3-to-1 payout when it wins to be playable. Conversely, when the pot offers you a 3-to-1 payout, you better make sure your hand has more than a 1/4 chance of winning.
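The pot-odds rule of thumb above reduces to a one-liner (a sketch; `pot_odds` is the X in an X-to-1 payout):

```python
def playable(win_prob, pot_odds):
    """A hand is worth playing when expected value is positive:
    win_prob * pot_odds - (1 - win_prob) > 0."""
    return win_prob * pot_odds - (1 - win_prob) > 0

print(playable(0.25, 3))  # exactly break-even: False
print(playable(0.30, 3))  # better than 1-in-4: True
```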
They didn't claim it was real did they? Just read out the result which was lower but not in a significant way. I've read hundreds of papers do the same.
By convention, this means "indistinguishable from", so reporting that it is lower is an unsupported claim. They would be equally justified in reporting that it was higher, i.e. not at all.
It was lower though, just not significantly so (depending on your threshold for significance) - that's the standard way of reporting it. You can't just take part of the sentence in isolation and take issue with it; the sentence in its entirety is accurate.
But if "result [was] lower but not in a significant way" means "result was not proven to be lower", how does saying "lower but not really lower" ever make any sense? It seems to me that such a nonsensical formulation ought never to be used by anyone.
Because significance thresholds can vary pretty dramatically. Plenty of experiments in physics, for instance, have reported results even though they didn't yet reach a 5 sigma threshold (about 3×10⁻⁷). In physics something can be highly, highly likely but still not 'significant' enough to warrant a discovery. They simply couch it as: hey, this was the result, and even though it isn't 'significant', the high likelihood may warrant additional research here. Reporting a binary significant/not significant is far less useful.
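For reference, the 5 sigma discovery threshold mentioned above corresponds to a one-tailed p-value of roughly 3×10⁻⁷, which comes straight from the standard-normal tail:

```python
import math

def one_tailed_p(sigma):
    """One-tailed tail probability of a standard normal at z = sigma."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

print(one_tailed_p(5))  # ~2.87e-7, the physics "5 sigma" threshold
print(one_tailed_p(1.645))  # ~0.05, the conventional significance level
```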
> "Not significant" means that the probability is >=5% their result was obtained by chance.
Ackchually... p-value represents the probability that results like these would be observed even if there was no difference between the two choices, simply due to chance.
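That definition can be sanity-checked with a quick simulation (a sketch with assumed sample sizes: both groups are drawn from the same distribution, so the null hypothesis is true by construction, and an alpha = 0.05 test should reject about 5% of the time):

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(42)
normal_cdf = NormalDist().cdf
alpha, trials, n = 0.05, 5000, 30

rejections = 0
for _ in range(trials):
    # Two samples from the SAME distribution: any observed difference is chance.
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    z = (mean(a) - mean(b)) / sqrt(2 / n)  # known-variance z statistic
    p_val = 2 * (1 - normal_cdf(abs(z)))   # two-sided p-value
    rejections += p_val < alpha

print(rejections / trials)  # close to 0.05: the false-positive rate the threshold controls
```

The threshold controls how often chance alone fools you; it says nothing about the probability that the effect is real.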
> The study results suggest repeating the experiment with a larger sample pool and introducing randomized controlled trials covering more clinical parameters.
You're writing like you disagree, but in the sentence after that the actual "claim" says to do exactly what you suggest(get more data).
"Was associated with a slight decrease of mean" is a claim. The p=0.05 convention requires that you report that the mean was not found to be different from the comparison distribution.
Some of the wording in the article is weird, and the "Context" gives very little of the relevant context.
> The current reference frequency for tuning musical instruments is 440 Hz
More accurately, the current most popular frequency for note A4 (the A above middle C) is 440 Hz. The article brings up the 432 Hz tuning but completely fails to mention that it isn't just an arbitrarily-selected frequency, but a frequency proposed by Verdi. He also proposed Scientific Pitch, which is tuning middle C to 256 Hz, making all C notes integer frequencies, and taking A4 to about 430.54 Hz.
I'm a fan of scientific pitch, and I tend to just use it as a default when I'm doing any audio programming because it makes it very easy to calculate all the pitches using C-4 as a base with frequency of 1 Hz.
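The arithmetic the parent describes is neat: with C-4 at 1 Hz, every pitch is a pure power of two plus a 12-TET semitone offset (a sketch assuming 12-tone equal temperament; the function name is made up for illustration):

```python
def scientific_pitch(semitones_above_c, octave):
    """Frequency under scientific pitch: C-4 = 1 Hz, so C4 = 2**8 = 256 Hz.
    `semitones_above_c` is 0 for C, 9 for A, etc."""
    return 2.0 ** (octave + 4 + semitones_above_c / 12)

print(scientific_pitch(0, 4))  # 256.0 (middle C)
print(scientific_pitch(9, 4))  # ~430.54 Hz (A4 under this standard)
```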
You're right. I'm not used to these sorts of sites, and I thought that might be the case, but I couldn't find the link to the PDF because it was part of the header bar (which becomes a floating header). I tuned that out as irrelevant, because I'm so used to ignoring floating headers when I'm not trying to navigate a site; I was looking everywhere in the center column, somewhere underneath the title, didn't find a link, and assumed that this was the entire article. That said, wow, this wants $36 just to read this single article, unless I'm part of an institution that gives me free access. That's steep. I'd like to read the entire article, but not at that price. It really hurts the accessibility of good information to have it be that exclusionary.
Fascinating. Does this resolve the thirds of triads being slightly out of tune? Or any of the weird wonkeyness that equal-temperament tuning has spawned?
Nope; scientific pitch is entirely an equal-temperament tuning. The only thing it does is makes it a bit easier to remember and calculate specific pitches.
It depends, but A4=440Hz seems to be the standard assumed by just about every electronic instrument I've encountered (and while some fancier ones make that configurable, most don't).
Realistically, the "correct" reference pitch for a given performance is going to depend on environmental conditions, since temperature and humidity will affect the "normal" pitch for an instrument (and will do so in different ways depending on the type of instrument). Most pragmatic approach would be to base it on the hardest-to-retune instrument in a given ensemble (i.e. take a tuner to it while it's playing some octave of A, do the requisite multiplication/division to get to A4, and there you have your reference pitch). Speaking from experience in marching music, the resulting reference pitch tended to vary between 438Hz and 442Hz, with very rare exceptions.
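The "requisite multiplication/division" above is just octave folding: double or halve the reading until it lands near A4. A minimal sketch (the 415-466 Hz window is an assumption, roughly a semitone either side of 440):

```python
def fold_to_a4(measured_a_hz, lo=415.0, hi=466.0):
    """Fold a measured A from any octave into the A4 neighborhood.
    Each factor of 2 is one octave, so pitch class is preserved."""
    f = measured_a_hz
    while f < lo:
        f *= 2
    while f >= hi:
        f /= 2
    return f

print(fold_to_a4(110.0))   # 440.0: an A2 at standard pitch folds up to A4
print(fold_to_a4(1760.0))  # 440.0: an A6 folds down to the same reference
```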
Very off topic from the original article, but the entire history of why we even use 440 Hz at all, as opposed to 432, 455 (Beethoven), 415 (Baroque era), and everything else used everywhere else, is so fascinating to me, because at a certain point you are entire semitones above or below what we grew up listening to as "in tune."
I’ve seen some interesting stuff (mostly Jacob Collier) where the tuning standard is actively manipulated during a song. Definitely convinces me we have not yet even begun to touch the boundaries of what’s possible in music.
> Definitely convinces me we have not yet even begun to touch the boundaries of what’s possible in music.
Yeah, but we're drifting in the opposite direction. Grid, 4/4, 440Hz and 12ET are the game if you publish anything these days in the western world and it's probably harder to do something different than it would be centuries ago.
I can't wait to hear something popular in 13ET, but probably won't happen in my lifespan.
In this vein, there's a pretty good playlist on Spotify called "Microtonal Bangers". It includes, among other things, psychedelic rock (KGLW), a blend of traditional Mauritanian music with pop music (Noura Mint Seymali), prog rock (Brendan Byrnes), jazz influenced by traditional Egyptian music (Ibrahim Maalouf), and electronic music (Sevish and Sungazer).
While this is just an initial exploration, and the authors admit as much, I can't resist pointing out the multitude of problems.
First, the most nitpicky but easiest to solve: they tried lowering the music by a tiny amount. Would lowering it more be even stronger? In particular, would lowering it by a semitone work even better? Then it would suggest it's about the pitch rather than the tuning.
Second, they did 440 first and then 436. There will be a familiarisation effect: the 2nd time in the same situation with the same music would likely be less stressful. But yeah, they mention they want to do randomization next.
Third, music is a deeply cultural thing. It may be that it's not objectively soothing, but rather subjectively soothing for people who grew up in a certain culture and learned certain subconscious associations. Some people like music that other people hate. This could quite plausibly apply even to the tuning.
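For scale on that first point: the 440 to 432 shift the study tested is only about a third of a 12-TET semitone. In cents (hundredths of a semitone):

```python
import math

def cents(f_from, f_to):
    """Signed interval between two frequencies in cents
    (100 cents = one 12-TET semitone, 1200 cents = one octave)."""
    return 1200 * math.log2(f_to / f_from)

print(cents(440, 432))  # ~ -31.8 cents: about a third of a semitone down
print(cents(440, 880))  # 1200.0: one full octave up
```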
I basically rewrote the whole thing in an edit so yeah.
> First, the most nitpicky but easiest to solve: they tried lowering the music by a tiny amount. Would lowering it more be even stronger? In particular, would lowering it by a semitone work even better? Then it would suggest it's about the pitch rather than the tuning.
Also, the way they lowered the pitch was to slow the music. So the change in tempo could also have an effect.
> Second, they did 440 first and then 436. There will be a familiarisation effect: the 2nd time in the same situation with the same music would likely be less stressful. But yeah, they mention they want to do randomization next.
They had two groups, one which did 432 Hz first and one which did 440 Hz first.
In the second video he gives a very interesting reason:
Given 440 is what we’re all used to on an everyday basis, the effects of the OP experiment might be explained by the “novelty” of how 432 sounds to us.
They are testing a shitton of variables so it's highly likely that by pure chance something "statistically significant" would appear. But not even that happened (there was NO significant change at all), so the recommendation of a "more powerful study" is absolutely baseless.
If anything, the conclusion should be "we wasted our time so you don't have to waste yours in the future". Nothing to be ashamed of.
Otherwise this is how you start another bunch of "science reproducibility crisis!".
> Maybe because its familiar but different at the same time.
I would bet on this. That's why a lot of people like the concerts versions, instead of just listening to the studio version. It's the same... but feels new again, thanks to those small variations and imperfections from a live performance.
Still, I find it interesting they did the research for this, I love myth-busting, to know if urban legends are true or false with the foundations of data.
I wonder if for people with Absolute pitch this tuning could mess with them.
With strings, the same instrument will play, respond, and end up producing a significantly different response profile with respect to both playability and the harmonic tonal profile [0], depending on the tension of a particular diameter & composition of string and its resulting frequency.
You also have the variable of whether the string instruments are simply retuned before testing alternate standards, or whether a full reselection of string gauges, optimized for the instrument dimensions & current pitch, has been accomplished for each particular tuning standard before testing.
Keep in mind that this is quite a task on a full 88-key piano, where more than one identical string is hammered per note, so there are over 200 strings to replace after 88 optimizations for any given tuning pitch standard. And you will need to have all possible string gauges individually stocked in abundance, not merely commercially available standard replacement string sets.
Up to a point, higher string tension can produce higher decibel output from equivalent playing technique, and this is what musicians really notice most, even on piano, when their fingers and/or bows are not in direct contact with the strings.
And imagine, if you will, the look on the face of a Stradivarius owner when the first set of proper steel strings arrives and is set up correctly. Then again, you wouldn't want to tighten the strings too much and break the irreplaceable fiddle.
Seems to me the pitch inflation over the centuries follows the technology for manufacturing higher-tension strings gradually more reliably over time. This was a slow, multi-generational process, but it seems shockingly fast when the advances are applied to established instruments, which are considered far more static and fully standardized already.
Eventually if you push the limits to the max you are going to exceed the sweet spot, maybe not for every instrument but the whole orchestra may not benefit, and if you have a good idea of how sweet it was you would know what to go back to.
And this is disregarding vocalists (or soloists, sometimes with a Stradivarius) who are often divas that must be catered to at the expense of everything else.
Remember perfect pitch today would be badly imperfect in any non-440 environment.
And you can only imagine how someone "born with perfect pitch" in a pre-440 world felt after the standard got away from them.
33 people and that goddamn awful significance testing? And not even randomized? Pure citation bait.
Edit: to highlight the problems of non-randomizing: it's not just the frequency that changed between experimental conditions. The conclusion could just as well have been "heart rate drops on second hearing" or "everyone's heart rate drops 4.8bpm in one day". I think that especially that last conclusion will make you think it can't be true, but as a hypothesis, it has the same status as the 440/432 manipulation.
Sure you can: 'We found this glow-in-the-dark sludge near a meteorite site. We injected this sludge in 33 healthy male human adults (Homo Sapiens Sapiens) and played loud music. A week later (with mean 168.5 hours and std. dev. 1 hour) they all died. Our hypothesis is that the loud music burst their eardrums and they bled out. We confirm this with a statistically significant test, p=1. We conclude all test-subjects are dead.'
Or maybe when they heard the same music for a second day in a row, it was familiar, the anticipation or excitement of wondering what comes next was reduced, the interest level was lower, and that led to the effect. This study doesn't seem very scientific.
Historic tuning references can be found in antique tuning forks (corrosion effects?) and organ pipes. A440 is a modern beast. A415 is used by many Baroque players. A392 may have been in use by the clavecinistes in the 18th century.
Then there's a multitude of ways to split the octave.
As far as ET is concerned, a change in key is simply shifting the register by a number of semitones. In non ET tunings, the interval configurations change with the keys.
With some mean tone tunings, you retune some strings to accommodate different keys.
Lots of people here mentioning that concert pitch wasn't always A = 440 Hz. Here's an interesting 12tone video explaining how we ended up at exactly that: https://youtu.be/BzznBt8tVnI (Spoiler: It involves nothing less than the Treaty of Versailles.)
This sort of study might be facilitated by something like Apple Watch opt-in health research. You would need thousands of participants to get anything significant. I would suppose that there isn't a significant effect of A=432Hz.
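The "thousands of participants" intuition can be made concrete with the standard normal-approximation sample-size formula (a sketch with illustrative assumed numbers: a 1 bpm true mean difference against 10 bpm between-subject standard deviation; neither number comes from the study):

```python
import math
from statistics import NormalDist

def per_group_n(delta, sigma, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-sample z-test:
    n = 2 * ((z_{1-alpha/2} + z_{power}) * sigma / delta)**2."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) * sigma / delta) ** 2)

print(per_group_n(1.0, 10.0))  # ~1570 per group: small effects need big samples
```

Halving the effect size quadruples the required sample, which is why an n=33 study can only hope to detect large effects.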
It's more about the number of students than the quality of the science. More science being done = more bad science. There's reason to be skeptical of overall quality, but I would hesitate to rush to any judgements based solely on what makes it into your news feed.
From the second link:
"C=256 has a uniquely defined astronomical value, as a Keplerian interval in the solar system."
and, in conclusion:
"If we arbitrarily changed the "tuning" of the solar system in a similar way, it would explode and disintegrate! God does not make mistakes: Our solar system functions very well with its proper tuning, which is uniquely coherent with C=256. This, therefore, is the only scientific tuning."
There's no way this is true though. I'd be willing to wager 10-to-1 that this isn't reproducible in a larger study. Also, this isn't much of a "finally" since it appears to be from 2019.
> Because of this harmonic misalignment, listening to 440 Hz music would seem to make people anxious, nervous, or aggressive, because it is not in harmony with the natural frequency of the planet earth.
> These same effects would have repercussions on human health since our DNA is sensitive to frequencies, as stated by the Professor Carlo Ventura’s team.
> Human DNA is sensitive to music and its relative frequencies to the point that it can even be reprogrammed through them. In fact, by subjecting stem cells to various frequencies it has been possible to modify their natural organic function.
The LaRouche folks are deluded, not satirists. Probably even more deluded than the mainstream society they rail against, but definitely differently deluded.
Yep, but only using 12-TET @ C4=256Hz. A “Twelve True-Fifths Tuning” is suggested though to get the magical number, similar to violin standard tuning method. [1]