Normalize specfile, or: How to find the proper level before pvoc synth?

Started by p8rpp, March 16, 2016, 01:00:18 PM


p8rpp

Hi dear fellow,

After applying various spectral-domain operations to a specfile, I want to convert it back to the time domain on the Unix command line using
pvoc synth.
I am getting a warning that my levels are too hot and that some samples were clipped. pvoc synth reports the gain correction factor I need to apply to the specfile in order to normalize it, which I would do using
spec gain
My question is: how can I normalize a specfile in the frequency domain beforehand, without having to call pvoc synth first, so that I can do it in a script without user interaction? Is there a simple way to normalize a specfile?

Thank you for all ideas!
best, Peter

simonk

Hello Peter,
From experience I would tend to normalise files before using Pvoc, but it may be that only one or two samples go over. It might be worth running the synth process and listening to how bad the outcome is.
Pvoc will report any 'overs'; some may not be (particularly) audible and can be fixed in a conventional editor.

All so dependent on source material!

Hope this helps
Simon

p8rpp

Hi Simon,

thank you for your kind reply. Do you normalize files before using "pvoc anal" or "pvoc synth"?
I am looking for a way to normalize while still working on .ana files, still in the frequency domain.
As described by both of us, "pvoc synth" will report overs, but I am wondering if there is a way to normalize .ana files automatically with any other cdp command before "pvoc synth", as I want to avoid the need to have user interaction in scripts.
Read as: "I want something that normalizes .ana files so that 'pvoc synth' will never be out of range."

Thank you again for your ideas!

all the best, P

simonk


Hi,
I tend to prepare material before any processing; over time I've found it more efficient when aiming for a 'known' output, and certainly with Pvoc processes. As previously stated, the source material can be so varied!

If I'm just exploring something, that's another matter and leaves room for serendipitous results, but even then there can often be a good deal of to-ing and fro-ing to get the right result.
On a modern machine it's pretty fast anyway... no setting the process off and going to bed these days!

>Read as: "I want something that normalizes .ana files so that 'pvoc synth' will never be out of range."

Like some sort of brickwall limiter?
Not to my knowledge, but someone else might have come up with a wrinkle.

Is there a reason why you don't wish to pre-prepare material?
I've been using Soundforge for all this for years; there's even a batch mode for many files. I would expect most audio editing software to do this nowadays.

Regards
Simon



rwdobson

Unfortunately it is not really possible to find the true net amplitude of an analysis frame other than by resynthesising it. Bins are not truly independent (may reinforce/cancel mutually), and in principle the contribution of the analysis and resynthesis windows should also be taken into account. Any computation to find the level would probably take longer than the already efficient resynthesis.
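
To illustrate why bin magnitudes alone do not determine the output level, here is a minimal Python sketch. It is not CDP code, and the function name and values are made up for illustration. It sums two sinusoids of equal frequency and amplitude, once in phase and once in opposite phase: the magnitudes are identical in both cases, yet the resulting peaks differ completely.

```python
import math

def peak_of_sum(phase_b, n=1000, freq=5.0):
    """Peak absolute value of sin(x) + sin(x + phase_b) over one second."""
    peak = 0.0
    for i in range(n):
        t = i / n
        a = math.sin(2 * math.pi * freq * t)
        b = math.sin(2 * math.pi * freq * t + phase_b)
        peak = max(peak, abs(a + b))
    return peak

in_phase = peak_of_sum(0.0)          # components reinforce each other
out_of_phase = peak_of_sum(math.pi)  # components cancel each other
print(f"in phase: {in_phase:.3f}, opposite phase: {out_of_phase:.3f}")
```

A real analysis frame has many bins with arbitrary phases, plus the windowing effects mentioned above, so the true peak lies somewhere between these two extremes and is only known after resynthesis.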

One solution is indeed to pre-normalise the input, best to a level below digital peak, e.g. -3dB or -6dB, process, and hope. The alternative is to resynthesise to a floating point sound file (on the command line, prepend -f to the outfile name), which will preserve over-range samples without clipping, and then rescale as required.
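
The "rescale as required" step can be sketched in Python as follows. This is a minimal illustration, not part of CDP: the function names and the sample values are hypothetical, and it assumes the float samples have already been read into a list by whatever soundfile reader is at hand.

```python
def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def normalize(samples, target_db=-3.0):
    """Scale float samples so the absolute peak sits at target_db dBFS."""
    if not samples:
        return []
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = db_to_gain(target_db) / peak
    return [s * gain for s in samples]

# Hypothetical over-range samples, as a float file can hold them unclipped
over = [0.2, -1.6, 0.9, 1.1]
scaled = normalize(over, target_db=-3.0)
print(max(abs(s) for s in scaled))  # roughly 0.708, i.e. -3 dBFS
```

The same `db_to_gain` arithmetic gives the linear factors for pre-normalising the input: -3 dB is roughly 0.708 and -6 dB roughly 0.501 of full scale.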

p8rpp

Hi Richard,

thank you for your helpful reply!
I tried to resynthesize with
     pvoc synth file.ana -f file.wav

but, unlike when I omit the -f flag, I am getting an error message:
     Too many parameters on command line.

Am I calling the command correctly? I am on bash (Debian GNU/Linux)

Thank you again!
cheers, P

rwdobson

It needs to be literally prefixed, without a space: -ffile.wav
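
For scripted use, the prefix rule can be captured in a small helper. This is only a sketch: the function name is hypothetical, and the argument order simply follows the pvoc synth call shown earlier in this thread.

```python
def pvoc_synth_args(ana_path, out_path, float_out=True):
    """Build the argument list for `pvoc synth`.

    The -f prefix is glued directly onto the outfile name, with no space.
    """
    outfile = ("-f" + out_path) if float_out else out_path
    return ["pvoc", "synth", ana_path, outfile]

print(pvoc_synth_args("file.ana", "file.wav"))
# ['pvoc', 'synth', 'file.ana', '-ffile.wav']
```

The resulting list could be handed to `subprocess.run`, which passes each element as a single argument, so the prefixed name never gets split.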

p8rpp

Thanks again Richard,
the literal prefix worked of course.

pvoc synth somefile.ana -fsynthesizedFloat.wav

I am wondering if it is writing a floating point wav file though. When I inspect the file properties using the Unix program sndfile-info, it looks like a 16-bit file with a max amplitude of 0dBFS; see also the output of sox(i) below.

$ sndfile-info synthesizedFloat.wav

Version : libsndfile-1.0.25

========================================
File : synthesizedFloat.wav
Length : 1752656
RIFF : 1752648
WAVE
fmt  : 16
  Format        : 0x1 => WAVE_FORMAT_PCM
  Channels      : 1
  Sample Rate   : 44100
  Block Align   : 2
  Bit Width     : 16
  Bytes/sec     : 88200
PEAK : 16
  version    : 1
  time stamp : 1458596576
    Ch   Position       Value
     0   46526          1
cue  : 28
  Count : 1
   Cue ID : 1718183539  Pos :     0  Chunk : data  Chk Start : 0  Blk Start : 0  Offset :     0
LIST : 2016
  adtl
    note : 2004
data : 1750528
End

----------------------------------------
Sample Rate : 44100
Frames      : 875264
Channels    : 1
Format      : 0x00010002
Sections    : 1
Seekable    : TRUE
Duration    : 00:00:19.847
Signal Max  : 32767 (-0.00 dB)

------------------------------------------------------------------------------------------------------------------------------------------------------------

Or using sox (called as "soxi" to give info on the file header):

$ soxi synthesizedFloat.wav

Input File     : 'synthesizedFloat.wav'
Channels       : 1
Sample Rate    : 44100
Precision      : 16-bit
Duration       : 00:00:19.85 = 875264 samples = 1488.54 CDDA sectors
File Size      : 1.75M
Bit Rate       : 706k
Sample Encoding: 16-bit Signed Integer PCM

Am I understanding or doing something wrong? Shouldn't it be a 24 or 32bit floating point wav?

Thank you so much!

best, Peter
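
The same check can be scripted without sndfile-info or soxi by reading the format tag from the fmt chunk of the WAV header. The sketch below is an illustration only (the function and dictionary names are made up); it handles the plain tags 1 (integer PCM) and 3 (IEEE float). Files written in the extensible format (tag 0xFFFE) carry the real format in a sub-field, which this sketch does not decode.

```python
import struct

FORMAT_NAMES = {1: "PCM (integer)", 3: "IEEE float"}

def wav_format(data):
    """Return (format_tag, bits_per_sample) from the fmt chunk of RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
        if chunk_id == b"fmt ":
            tag, _channels, _rate, _byte_rate, _align, bits = struct.unpack(
                "<HHIIHH", data[pos + 8:pos + 24])
            return tag, bits
        pos += 8 + size + (size & 1)  # chunks are padded to even length
    raise ValueError("no fmt chunk found")

# Build a minimal mono 32-bit float header in memory to exercise the parser
fmt = struct.pack("<HHIIHH", 3, 1, 44100, 44100 * 4, 4, 32)
header = (b"RIFF" + struct.pack("<I", 4 + 8 + len(fmt)) + b"WAVE"
          + b"fmt " + struct.pack("<I", len(fmt)) + fmt)
tag, bits = wav_format(header)
print(FORMAT_NAMES.get(tag, hex(tag)), bits)
```

For a real file, pass in the first few kilobytes read in binary mode. The file inspected above would come back as tag 1 with 16 bits, matching the WAVE_FORMAT_PCM line in the sndfile-info output.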

rwdobson

Ah indeed; I will need to check the version, as it is ~supposed~ to write a float file! It was a facility incorporated many years ago; unfortunately a collision of sources at one point resulted in a regression defaulting to 16bit, and it had to be un-regressed, which was done about a year ago (it is a generic facility for all applicable output files). Somehow you have one of the older versions. It is too late to look at that tonight, but I will investigate further tomorrow.


p8rpp

Thanks again Richard,

I compared pvoc.c in the sources available from http://unstablesound.net/downloads/CDPrelease7src.zip with those from http://unstablesound.net/downloads/CDPrelease7src-linuxbeta.tar.gz and they differ in many ways. The first differing line, for example, is

OSX:
static int  sndwrite_header(int N2,long *Nchans,float *arate,float R,int D,long origsize,long *isr,int M,dataptr dz);
LINUX:
static int  sndwrite_header(int N2,int *Nchans,float *arate,float R,int D,int origsize,int *isr,int M,dataptr dz);

but I am not sure if that's somewhere around the source of the error.

cheers, P

p8rpp

Dear Richard,
is there any news on 24-bit and float output for Linux? Did you manage to find the time to fix this, and is it now more a matter of creating a new distribution? If there is a fix in place, would it be reflected in the git codebase?
Thank you so much. It would be great to resolve this situation after all these years, and any help is much appreciated!
best, P

p8rpp

Two quick updates to this topic.
As of the current version of CDP on Linux from git at https://github.com/ComposersDesktop/CDP7/, pvoc respects the bit size and analyses and resynthesises 24- and 32-bit soundfiles correctly.
Using the -f flag at resynthesis will also write 32-bit files. Nevertheless, the gain correction factor reported by pvoc synth has to be applied to the .ana source file before resynthesis to obtain a 0dBFS signal. Applying this factor to the resulting .wav file gives a far too soft amplitude for me.
Feel free to move further discussion over to the mailing list at https://lists.mur.at/mailman/listinfo/cdp