
Conversion of Audio Samples to Video Frames

Brooks Harris

May 8, 1998

Converting between digital audio sampling rates and video frames requires calculations sufficiently accurate to ensure maintenance of the audio/video synch relationship. Computations of this nature are best performed on computers using integer math wherever possible. The ratios between NTSC frame and audio sampling frequencies present the most demanding calculation because of the intentionally odd relationships of the NTSC frequencies. The form of PAL to audio sampling calculations can follow the NTSC examples.

NTSC Frame Rate

The NTSC standard for color transmission was originally designed to be compatible with existing B&W equipment. In particular, a Color Subcarrier frequency needed to be selected which A) did not interfere with the 4.5 MHz audio subcarrier, B) would minimize visible artifacts from the color subcarrier, C) would result in vertical and horizontal scanning rates very near the existing B&W rates, and D) would conform to the constraints of existing transmission channels.

The original RS-170A publication showed the results of these calculations rounded to an approximation deemed appropriate for the tolerances achievable by calculators and hardware of that era. Here, the frequencies are shown to about 15 significant digits.

The choices used to derive the NTSC frequencies are given by:

$$f_{line} = \frac{4.5 \times 10^{6}}{286} = 15{,}734.2657342657\ \text{Hz}$$

Horizontal Scan Frequency: the 4.5 MHz Audio Subcarrier divided by the selected sub-multiple, 286.

$$f_{field} = \frac{f_{line}}{525 / 2} = 59.940059940059940\ \text{Hz}$$

Vertical Field Frequency: 525 is the number of lines per frame (2 fields).

$$f_{sc} = \frac{13 \times 7 \times 5}{2} \times f_{line} = 3{,}579{,}545.45454545\ \text{Hz}$$

Color Subcarrier Frequency: 13, 7 and 5 are the selected sub-multiples.

or:

$$f_{sc} = 455 \times \frac{f_{line}}{2} = 3{,}579{,}545.45454545\ \text{Hz}$$

The locked relationship between Color Subcarrier and Horizontal Scan Frequency is illustrated by:

$$f_{line} = \frac{2 \times f_{sc}}{13 \times 7 \times 5} = 15{,}734.2657342657\ \text{Hz}$$

Horizontal Scan Frequency.

We are particularly concerned with Vertical Frame Frequency, or 1/2 the Vertical Field Frequency. This can be calculated by reducing the appropriate terms in the formulas above, demonstrating how to use integer math to perform these calculations.

From the formulas above we have:

$$\frac{\dfrac{4500000}{286}}{\dfrac{525}{2}}$$

where 4500000 is the audio subcarrier in Hz, 286 is the selected sub-multiple, 525 is the number of horizontal lines per frame, and dividing 525 by 2 gives the number of horizontal lines per field.

The 2 can be eliminated because we are interested in frames, not fields, and this can be reduced to:

$$\frac{4500000}{286 \times 525} = \frac{4500000}{150150} = \frac{30000}{1001} = 29.970029970029970\ \text{Hz (Vertical Frame Frequency)}$$
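The reduction to 30000 / 1001 can be checked mechanically by dividing numerator and denominator by their greatest common divisor. The following is a minimal sketch, assuming a C99 compiler with <stdint.h>; the names gcd_u64 and reduce_fraction are illustrative and do not come from any standard or from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Greatest common divisor by Euclid's algorithm. */
uint64_t gcd_u64(uint64_t a, uint64_t b)
{
    while (b != 0) {
        uint64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Reduce *num / *den to lowest terms in place (hypothetical helper). */
void reduce_fraction(uint64_t *num, uint64_t *den)
{
    assert(*den != 0);                 /* a zero denominator is not a ratio */
    uint64_t g = gcd_u64(*num, *den);  /* g >= 1 because *den != 0 */
    *num /= g;
    *den /= g;
}
```

Called with 4500000 and 286 x 525 = 150150, the common divisor is 150 and the pair is reduced to 30000 and 1001.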

Maximum audio samples and video frames

To evaluate the requirements for performing these calculations on a computer we may ask, "What is the largest audio sample number we must handle?" Put another way, "What is the number of the last audio sample in 24 hours?"

The NTSC frame rate is 29.970029970029970 FPS (30000 / 1001), slightly slower than the nominal 30 FPS. When each frame is labeled incrementally, as with Non-drop Frame, the video frame labeled 24:00:00:00 (23:59:59:29 + 1) will occur later than a true 24 hours.

The NTSC Non-drop Frame labeling scheme labels every video frame to 23:59:59:29 + 1:

24 x 60 x 60 x 30 = 2,592,000 NTSC frames labeled by Non-drop Frame 23:59:59:29 + 1

The true elapsed seconds of 2,592,000 NTSC frames is:

2,592,000 x (1001 / 30000) = 86,486.4 Elapsed seconds of 2,592,000 NTSC frames

So, the maximum audio samples in NTSC Non-drop Frame 23:59:59:29 + 1 are:

86,486.4 x 44100 = 3,814,050,240 Samples at 44100 in NTSC Non-drop Frame "24:00:00:00"
86,486.4 x 48000 = 4,151,347,200 Samples at 48000 in NTSC Non-drop Frame "24:00:00:00"

For PAL, this is a simpler calculation because the PAL video frame rate is a true 25 FPS:

44100 x 24 x 60 x 60 = 3,810,240,000 = 44100 samples in true 24 hours
48000 x 24 x 60 x 60 = 4,147,200,000 = 48000 samples in true 24 hours
25 x 24 x 60 x 60 = 2,160,000 = PAL video frames in true 24 hours

These are the numbers we must handle. The largest of them (4,151,347,200, the 48K NTSC sample count) will fit in a 32-bit unsigned long integer variable (ULONG_MAX = 4,294,967,295 on platforms where unsigned long is 32 bits) and so a variable of that type can store all of these numbers. Note that 32 bits will NOT hold the sample count for 96K sampling: 86,486.4 x 96000 = 8,302,694,400.
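The sample counts above can be reproduced with integer math alone by multiplying first and dividing last. This is a minimal sketch under stated assumptions: a C99 compiler with <stdint.h>, and the hypothetical names NDF_FRAMES_PER_DAY, samples_in_ndf_day and fits_in_32_bits, none of which come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Frames labeled by Non-drop Frame up to 23:59:59:29 + 1 (24 x 60 x 60 x 30). */
#define NDF_FRAMES_PER_DAY 2592000u

/* Samples elapsed in that many NTSC frames:
   frames x (1001 / 30000) x sample_rate, evaluated in 64 bits. */
uint64_t samples_in_ndf_day(uint32_t sample_rate)
{
    assert(sample_rate > 0);
    /* The largest possible product is below 2^64 for any 32-bit rate. */
    uint64_t n = (uint64_t)NDF_FRAMES_PER_DAY * 1001u * sample_rate;
    return n / 30000u;
}

/* Nonzero when the value can be stored in a 32-bit unsigned integer. */
int fits_in_32_bits(uint64_t value)
{
    return value <= UINT32_MAX;
}
```

For 44100 and 48000 the results are 3,814,050,240 and 4,151,347,200, both of which fit in 32 bits; for 96000 the result is 8,302,694,400, which does not.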

Summary

2,592,000 NTSC video frames in Non-drop Frame "24:00:00:00"
3,814,050,240 Samples at 44100 in NTSC Non-drop Frame "24:00:00:00"
4,151,347,200 Samples at 48000 in NTSC Non-drop Frame "24:00:00:00"
2,160,000 PAL video frames in true 24 hours
3,810,240,000 Samples at 44100 in true 24 hours
4,147,200,000 Samples at 48000 in true 24 hours

48000 Sample Rate to NTSC frames

Deriving our formulas from basic principles we begin with the integer values contained in the NTSC formulas and factor to reduce the equation to the simplest possible form. Beginning with the formulas above and introducing the terms for 48K sampling, we have:

$$\frac{4500000}{286 \times 525} \times \frac{1}{48000}$$

This can be reduced to:

$$\frac{4500}{286 \times 525 \times 48} = \frac{4500}{7207200} = \frac{45}{72072} = \frac{5}{8008}$$

This is the ratio between the NTSC frame rate and the 48K sampling rate, that is, the number of NTSC frames per 48K sample. As stated above, the last 48K sample in 24 hours of Non-drop Frame NTSC is 4,151,347,200. To convert this sample number to an NTSC video frame number we have:

$$\frac{4151347200 \times 5}{8008} = \frac{20{,}756{,}736{,}000}{8008} = 2{,}592{,}000\ \text{NTSC Frames}$$

The product of the first multiplication is too large to fit in 32 bits. The calculation can still be performed with integer math if a 64-bit integer type is available on the platform. The Microsoft C/C++ compiler supports __int64. On this platform the calculation can be written:

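/* Convert a 48K sample number to an NTSC frame number (samples x 5 / 8008). */
unsigned long Samples48KToNTSC(unsigned long Sample_Input_48K)
{
    /* Widen to 64 bits before multiplying: the product exceeds 32 bits. */
    __int64 Result64;
    Result64 = (__int64) Sample_Input_48K * 5;
    Result64 = Result64 / 8008;
    return (unsigned long) Result64;
}

/* Convert an NTSC frame number to a 48K sample number (frames x 8008 / 5). */
unsigned long NTSCToSamples48K(unsigned long NTSC_Frames)
{
    __int64 Result64;
    Result64 = (__int64) NTSC_Frames * 8008;
    Result64 = Result64 / 5;
    return (unsigned long) Result64;
}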
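The two functions above depend on the Microsoft-specific __int64 type. As a minimal sketch of the same 5 / 8008 ratio on other platforms, assuming a C99 compiler that provides uint64_t in <stdint.h>, the conversions might be written as follows. The function names are hypothetical, and the assertions are an added guard against overflow of the 64-bit product rather than something the text requires.

```c
#include <assert.h>
#include <stdint.h>

/* 48K sample number -> NTSC frame number (frames = samples x 5 / 8008). */
uint64_t samples48k_to_ntsc_frames(uint64_t sample)
{
    assert(sample <= UINT64_MAX / 5);    /* keep sample x 5 inside 64 bits */
    return (sample * 5) / 8008;          /* integer division truncates */
}

/* NTSC frame number -> 48K sample number (samples = frames x 8008 / 5). */
uint64_t ntsc_frames_to_samples48k(uint64_t frame)
{
    assert(frame <= UINT64_MAX / 8008);  /* keep frame x 8008 inside 64 bits */
    return (frame * 8008) / 5;           /* integer division truncates */
}
```

Sample 4,151,347,200 maps to frame 2,592,000 and back again. Because integer division truncates, a round trip is exact only at frame numbers that are multiples of 5: frame 1 maps to sample 1601 (1601.6 truncated), and sample 1601 maps back to frame 0, while sample 1602 maps to frame 1. Whether truncation or rounding up is appropriate depends on the application.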

44100 Sample Rate to NTSC frames

Similarly for 44100 Sampling, we have:

$$\frac{1}{44100} \times \frac{4500000}{286 \times 525} = \frac{45000}{286 \times 525 \times 441} = \frac{4500}{6621615} = \frac{100}{147147}$$

The last 44100 sample in 24 hours of Non-drop frame NTSC is 3,814,050,240.

$$\frac{3814050240 \times 100}{147147} = \frac{381{,}405{,}024{,}000}{147147} = 2{,}592{,}000\ \text{NTSC Frames}$$

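In Microsoft C/C++:

/* Convert a 44100 sample number to an NTSC frame number (samples x 100 / 147147). */
unsigned long Samples441ToNTSC(unsigned long Sample_Input_441)
{
    /* Widen to 64 bits before multiplying: the product exceeds 32 bits. */
    __int64 Result64;
    Result64 = (__int64) Sample_Input_441 * 100;
    Result64 = Result64 / 147147;
    return (unsigned long) Result64;
}

/* Convert an NTSC frame number to a 44100 sample number (frames x 147147 / 100). */
unsigned long NTSCToSamples441(unsigned long NTSC_Frames)
{
    __int64 Result64;
    Result64 = (__int64) NTSC_Frames * 147147;
    Result64 = Result64 / 100;
    return (unsigned long) Result64;
}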
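The 48K and 44100 cases differ only in the reduced ratio, so one function can serve both by working directly from 30000 / 1001 and the sample rate. The following is a sketch under stated assumptions, not the author's method: it assumes C99 with <stdint.h>, leaves the ratio unreduced (the 64-bit intermediate has ample room for 24 hours of audio), and uses hypothetical names throughout.

```c
#include <assert.h>
#include <stdint.h>

/* NTSC frame rate as the exact ratio 30000 / 1001 frames per second. */
#define NTSC_RATE_NUM 30000u
#define NTSC_RATE_DEN 1001u

/* frames = samples x 30000 / (1001 x sample_rate) */
uint64_t samples_to_ntsc_frames(uint64_t sample, uint32_t sample_rate)
{
    assert(sample_rate > 0);
    assert(sample <= UINT64_MAX / NTSC_RATE_NUM);  /* product stays in 64 bits */
    return (sample * NTSC_RATE_NUM) / ((uint64_t)NTSC_RATE_DEN * sample_rate);
}

/* samples = frames x (1001 x sample_rate) / 30000 */
uint64_t ntsc_frames_to_samples(uint64_t frame, uint32_t sample_rate)
{
    assert(sample_rate > 0);
    uint64_t scale = (uint64_t)NTSC_RATE_DEN * sample_rate;
    assert(frame <= UINT64_MAX / scale);           /* product stays in 64 bits */
    return (frame * scale) / NTSC_RATE_NUM;
}
```

With a rate of 44100 this agrees with the 100 / 147147 ratio: sample 3,814,050,240 maps to frame 2,592,000. The same call with 96000 handles sample 8,302,694,400, provided the result is kept in a 64-bit variable.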

48000 Sample Rate to PAL frames

The same computational approach can be applied to converting audio samples to PAL (25 FPS) frames.

The ratio of 48000 Sampling to PAL 25 FPS is:

$$\frac{25}{48000} = \frac{1}{1920}$$

From above, the number of 48000 samples in 24 hours is 4,147,200,000.

$$\frac{4{,}147{,}200{,}000}{1920} = 2{,}160{,}000\ \text{PAL video frames}$$

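This is a simpler and less demanding calculation than for NTSC and can be accomplished within 32-bit integers.

/* Convert a 48K sample number to a PAL frame number (1920 samples per frame). */
unsigned long Samples48KToPAL(unsigned long Sample_Input_48K)
{
    return Sample_Input_48K / 1920;
}

/* Convert a PAL frame number to a 48K sample number. */
unsigned long PALToSamples48K(unsigned long PAL_Frames)
{
    return PAL_Frames * 1920;
}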

44100 Sample Rate to PAL frames

Similarly for 44100 Sampling to PAL video frames:

$$\frac{25}{44100} = \frac{1}{1764}$$

The last 44100 sample in 24 hours is 3,810,240,000:

$$\frac{3{,}810{,}240{,}000}{1764} = 2{,}160{,}000\ \text{PAL video frames}$$

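/* Convert a 44100 sample number to a PAL frame number (1764 samples per frame). */
unsigned long Samples441ToPAL(unsigned long Sample_Input_441)
{
    return Sample_Input_441 / 1764;
}

/* Convert a PAL frame number to a 44100 sample number. */
unsigned long PALToSamples441(unsigned long PAL_Frames)
{
    return PAL_Frames * 1764;
}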
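The PAL conversions stay within 32 bits only while the frame number is within about 24 hours; beyond that the multiplication wraps silently. As a hedged sketch of one way to guard against this, assuming C99 with <stdint.h> and a sample rate that is a multiple of 25, the frame-to-sample direction can report failure instead of overflowing. All names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Samples per PAL frame: 48000 / 25 = 1920, 44100 / 25 = 1764. */
uint32_t samples_per_pal_frame(uint32_t sample_rate)
{
    assert(sample_rate >= 25 && sample_rate % 25 == 0);
    return sample_rate / 25;
}

/* Sample number -> PAL frame number; division cannot overflow. */
uint32_t samples_to_pal_frames(uint32_t sample, uint32_t sample_rate)
{
    return sample / samples_per_pal_frame(sample_rate);
}

/* PAL frame number -> sample number.
   Returns 1 and stores the result on success,
   or 0 if the product would not fit in 32 bits. */
int pal_frames_to_samples(uint32_t frame, uint32_t sample_rate,
                          uint32_t *sample_out)
{
    uint32_t per_frame = samples_per_pal_frame(sample_rate);
    if (frame > UINT32_MAX / per_frame)
        return 0;                       /* frame x per_frame would overflow */
    *sample_out = frame * per_frame;
    return 1;
}
```

At 48000, frame 2,160,000 converts to sample 4,147,200,000, while any frame number above 2,236,962 is rejected because 1920 times that frame number exceeds 4,294,967,295.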

References

1. K. B. Benson and J. Whitaker, Television Engineering Handbook, McGraw-Hill, 1992.
2. A. Inglis and A. Luther, Video Engineering, McGraw-Hill, 1996.
3. EIA Tentative Standard RS-170A, Color Television Studio Line Amplifier Output, 1977.
4. SMPTE Standard SMPTE 170M, Composite Analog Video Signal - NTSC for Studio Applications, 1994.

About the Author

Brooks Harris is President of Brooks Harris Film & Tape, Inc. (BHFT), a software development company in NYC specializing in edit data exchange. BHFT markets EDLMAX, an EDL and OMF management application, and provides consulting and custom implementations for OEM clients. Harris is a member of SMPTE and AES and a contributor to the SMPTE P18.27 Working Group on Editing Procedures and AES SC-06-01 Audio File Interchange. Harris is also a contributor to the EBU/SMPTE Task Force on the Harmonization of Data Interchange. BHFT is an OMF Champion and an AAF Adopter.