Suppose I have samples of a signal(s) + noise process which is sampled at 4x the rate necessary for my desired spectral resolution.
My desire is to pick the signals out of the noise, but I have little to no a priori knowledge of the noise characteristics. If that makes the problem intractable, we can assume Gaussian or fractional Gaussian noise.
How can I make use of the oversampled data to improve PSD estimation? Obviously I can use the extra samples to get greater frequency resolution, but that is not my desire.
In typical Welch's method type of approaches, I can divide the data into possibly overlapping windows and average periodograms. Would it be essentially equivalent to construct 4 downsampled waveforms by taking every 4th sample, starting at sample 0, then 1, 2, and 3, compute periodograms of the 4 separate downsampled waveforms then average them? Would this also reduce the variance of the periodogram estimate similar to Welch's method?
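To make the proposed scheme concrete, here is a minimal numpy sketch (the function name `polyphase_avg_periodogram` is mine, not a library routine):

```python
import numpy as np

def polyphase_avg_periodogram(x, m=4):
    """Average the periodograms of the m polyphase components of x
    (every m-th sample, starting at offsets 0..m-1), as proposed above.

    Note: this reduces variance only to the extent the m components are
    statistically independent; for 4x-oversampled (band-limited, hence
    correlated) data they largely are not.
    """
    n = (len(x) // m) * m
    phases = x[:n].reshape(-1, m).T                # shape (m, n // m)
    psds = np.abs(np.fft.rfft(phases, axis=1)) ** 2 / (n // m)
    return psds.mean(axis=0)
```

For truly white input the four components are independent and the averaging does cut the periodogram variance by about 4; the answers below address what happens when the data is oversampled and correlated.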
You did not describe the bandwidth of the noise: you described its density function but not its spectral spread. Is the noise band-limited to the signal bandwidth, as it would be at the output of a matched filter? Or is it white or colored noise with spectral content out to half the sample rate?
The variance reduction you are seeking comes from averaging the spectra of essentially independent estimates of the spectrum. Normally that means 50% overlapped sliding Hann-windowed segments for spectra with about 45 dB dynamic range, or 75% overlapped sliding Kaiser-windowed segments for spectra with some 90 dB dynamic range. What we need is longer blocks of data to enable more overlapped intervals. Oversampling offers no benefit to the spectral estimation problem: if the data is oversampled 4-to-1 there is high correlation between adjacent time samples, and that correlation disrupts the variance reduction due to averaging. Downsampling by taking every 4th sample, starting at offsets (0, 1, 2, 3), will give you essentially the same data four times over and will not improve the stability of the spectral estimate.
Collect a longer run of data at the Nyquist rate and then average the sliding windowed transforms.
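The variance reduction from overlapped windowed averaging can be checked numerically with `scipy.signal.welch` (the segment lengths and data sizes below are illustrative choices of mine):

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(1)
fs = 1000.0
x = rng.standard_normal(16384)          # white noise, critically sampled

# Welch's method: 50% overlapped Hann windows; the variance of the PSD
# estimate drops roughly with the number of (nearly) independent segments.
f, p_welch = welch(x, fs=fs, window="hann", nperseg=1024, noverlap=512)

# Single full-length periodogram for comparison (no averaging).
f1, p_per = welch(x, fs=fs, window="boxcar", nperseg=len(x), noverlap=0)

# Relative spread of the estimates across bins: the averaged estimate
# is far more stable, while both have the same mean level.
rel_welch = p_welch.std() / p_welch.mean()
rel_per = p_per.std() / p_per.mean()
```

The raw periodogram has a relative standard deviation near 1 regardless of record length; overlapped averaging is what buys stability, which is why a longer run at the Nyquist rate helps and oversampling does not.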
You posed at least two questions.
To address them, it helps to clarify what you mean by "noise" in the "signal(s) + noise process".
Noise is a generic term with many meanings; it really stands for undesired signals.
An undesired signal could be a desired signal to another user.
An undesired signal, "noise", could also mean thermal noise, quantization noise, phase noise, etc.
So it needs clarifying which noise you are referring to.
Q1. "How can I make use of the oversampled data to improve PSD estimation?"
A1. Every time you sample data, the quantization in the sampling process introduces quantization noise.
You reduce the in-band quantization noise relative to Nyquist-rate sampling by oversampling. Oversampling by 4x provides an additional 6 dB of in-band quantization-noise suppression (once the out-of-band noise is filtered off), hence better SNR. But again, it depends on how "noise" is defined.
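A quick numerical check of the ~6 dB figure, assuming the quantization error is roughly white with total power q²/12 (a common idealization that real, slowly varying signals can violate); the step size and record lengths are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(2)
q = 0.05                                  # quantizer step size

def quantize(x, step):
    # uniform mid-tread quantizer
    return step * np.round(x / step)

# One band-limited signal, viewed at the critical rate fs and at 4*fs.
n = 4096
w = rng.standard_normal(4 * n)
W = np.fft.rfft(w)
W[len(W) // 4:] = 0.0                     # band-limit to fs/2 (1/4 of 2*fs)
x_hi = np.fft.irfft(W)                    # sampled at 4*fs
x_lo = x_hi[::4]                          # same signal sampled at fs

e_lo = quantize(x_lo, q) - x_lo           # quantization error at fs
e_hi = quantize(x_hi, q) - x_hi           # quantization error at 4*fs

# The error power is ~q**2/12 at either rate, but at 4*fs it spreads over
# 4x the bandwidth, so only ~1/4 of it lands in the signal band (~6 dB).
E_hi = np.fft.rfft(e_hi)
N = len(e_hi)
inband = 2.0 * np.sum(np.abs(E_hi[: len(E_hi) // 4]) ** 2) / N**2
gain_db = 10 * np.log10(np.mean(e_lo**2) / inband)
```

`gain_db` comes out near 6 dB, matching the 10·log10(4) oversampling gain; the benefit is only realized if the out-of-band quantization noise is actually filtered away before (or during) spectral analysis.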
Adding to the useful contributions from Shafie and Fred.
DFT frequency resolution = sampling rate / number of samples (fs/N). For a fixed observation time, oversampling at the ADC increases N in proportion to fs, so the bin spacing is unchanged; upsampling afterwards may introduce further noise depending on the upsampling method.
PSD estimation can be non-parametric (periodogram/Welch style) or parametric. "Big box" spectrum analysers use filter banks for direct estimation; there are digital methods that do the same.
Estimating out-of-band noise is possible, but estimating in-band noise is not straightforward no matter what DFT resolution is used. Increasing the DFT resolution also produces an apparent (false) noise-floor reduction, because the same noise power is spread across more bins.
The Welch method is not the same as downsampling and averaging. Welch keeps the bandwidth and increases the width of the frequency bins; downsampling and averaging keeps the frequency step but reduces the Nyquist frequency. I guess in your case you would prefer downsampling and averaging.
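The difference in frequency grids is easy to see directly (the rate and record length below are example values of mine):

```python
import numpy as np

fs, N = 1000.0, 4096                      # example sample rate, record length

# Welch with segments of length N/4: bin spacing 4*fs/N, band up to fs/2.
f_welch = np.fft.rfftfreq(N // 4, d=1.0 / fs)

# Downsample by 4, then a length-N/4 DFT: bin spacing fs/N, band up to fs/8.
f_down = np.fft.rfftfreq(N // 4, d=4.0 / fs)

spacing_welch, band_welch = f_welch[1], f_welch[-1]
spacing_down, band_down = f_down[1], f_down[-1]
```

Both estimates use the same number of bins; Welch spends them covering the full band coarsely, while downsampling keeps the fine frequency step of the original full-length DFT but only out to the reduced Nyquist frequency.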