If you're taking something from 44.1 kHz to 48 kHz, only 91.875% of the output corresponds to real data (44.1/48), so the remaining 8.125% of the upsampled data is invented. Some of it will correlate with the original, real sound. If you use upsampling functions tuned to features of the audio (style: music, voice, bird recordings, NYC traffic, a known auditorium, etc.), you can probably bring the accuracy up by several percent. If the original data already reflects those optimizations, there's less to gain and you'll stay closer to the ~92% baseline.
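As a rough sketch of the bookkeeping (assuming Python with numpy and scipy; the 1 kHz test tone is just a made-up example), the 44.1-to-48 conversion works out to a 160/147 ratio, and a standard content-agnostic polyphase resampler is the classic way to fill in the new samples:

    import numpy as np
    from scipy.signal import resample_poly

    fs_in, fs_out = 44_100, 48_000
    g = np.gcd(fs_in, fs_out)             # 300
    up, down = fs_out // g, fs_in // g    # 160 / 147
    print(fs_in / fs_out)                 # 0.91875 -> ~8.1% of the output is interpolated

    # one second of a hypothetical 1 kHz test tone at 44.1 kHz
    t = np.arange(fs_in) / fs_in
    x = np.sin(2 * np.pi * 1000 * t)

    # classic polyphase resampling: every new sample is a weighted sum of its
    # neighbours, i.e. "invented" by a fixed interpolation filter rather than
    # anything tuned to the content
    y = resample_poly(x, up, down)
    print(len(x), len(y))                 # 44100 -> 48000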
If it's really good AI upsampling, you might get qualitatively "better" sounding audio than the original, but it will still technically deviate from the original baseline by ~8%. Conversely, there'll be technically "correct" upsampling results with higher overall alignment with the original that can sound awful.
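"Alignment with the original" can be made concrete with a simple error metric against a true 48 kHz reference; the arrays below are hypothetical, and the point of the sketch is only that this score and the listening impression are independent axes:

    import numpy as np

    def relative_rms_error(reference: np.ndarray, candidate: np.ndarray) -> float:
        # fraction of the reference's energy that the candidate gets wrong
        n = min(len(reference), len(candidate))
        err = reference[:n] - candidate[:n]
        return float(np.sqrt(np.mean(err**2)) / np.sqrt(np.mean(reference[:n]**2)))

    # hypothetical arrays: a true 48 kHz capture and two upsampled versions of
    # the same material; a lower score means tighter alignment with the original,
    # but it says nothing about which one actually sounds better
    # score_ai      = relative_rms_error(reference_48k, ai_upsampled)
    # score_classic = relative_rms_error(reference_48k, polyphase_upsampled)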
There's still a lot to audio processing that's more art than science.