The TUG-EEC-Channels Database V. 1.1
The _TUG-EEC-Channels_ database consists of a collection of recordings of voice radio transmissions, which were generated during flights with a general aviation aircraft. Maximum length sequences (MLS) were transmitted over the voice channel of an amplitude modulation (AM) aeronautical VHF radio and the received signals were recorded. The measurements cover a wide range of typical flight situations as well as static back-to-back calibrations.
Measurement System and Experiments
For detailed information about the applied measurement system and the conducted measurements please consult:
- K. Hofbauer, H. Hering, and G. Kubin, “A measurement system and the TUG-EEC-Channels database for the aeronautical voice radio,” in Proceedings of the 64th IEEE Vehicular Technology Conference (VTC2006-Fall), (Montréal, Canada), Sept. 2006.
All recorded data is segmented into frames of a fixed time length. The length in samples depends on the sampling rate and is 63 samples at 8 kHz, 252 samples at 32 kHz and 378 samples at 48 kHz.
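The three frame lengths above all describe the same frame duration of 7.875 ms, which is one period of the 63-sample MLS at 8 kHz. A small sketch to check this (the names `FRAME_LENGTHS` and `frame_duration_ms` are illustrative only):

```python
# Frame lengths in samples, keyed by sampling rate in Hz, as listed above.
FRAME_LENGTHS = {8000: 63, 32000: 252, 48000: 378}


def frame_duration_ms(sampling_rate_hz: int) -> float:
    """Return the frame duration in milliseconds for one of the listed rates."""
    return 1000.0 * FRAME_LENGTHS[sampling_rate_hz] / sampling_rate_hz


if __name__ == "__main__":
    for rate in FRAME_LENGTHS:
        print(rate, "Hz:", frame_duration_ms(rate), "ms")
```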
Every time frame consists of:
- Unique FrameID
- Transmission recording segment
- Reception recording segment
- I/Q recording segment
- Link to original sequence
- Uplink/downlink direction
- Estimated channel impulse response
- Flight parameters elevation, latitude, longitude, distance to tower, speed, and azimuth relative to line-of-sight
- Plain-text comments
For technical reasons, the frames are grouped into segments. The SegmentID is the FrameID of the first frame of each segment and is an integer multiple of 1,000,000. The full data set for one segment can be downloaded as one compressed zip archive (e.g. ‘2000000.zip’).
(Technically, the segments are defined as time-segments in which the two recording files and the transmission direction do not change.)
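As a minimal sketch, the SegmentID and archive name for a given FrameID could be derived as follows. This assumes that every segment holds fewer than 1,000,000 frames, so that a frame always belongs to the largest SegmentID not exceeding its FrameID. The function names are hypothetical and not part of the database.

```python
SEGMENT_STEP = 1_000_000  # SegmentIDs are integer multiples of this value


def segment_id(frame_id: int) -> int:
    """Round a FrameID down to the nearest multiple of SEGMENT_STEP.

    Assumption: no segment contains 1,000,000 frames or more.
    """
    return (frame_id // SEGMENT_STEP) * SEGMENT_STEP


def segment_archive(frame_id: int) -> str:
    """Name of the zip archive that would contain the given frame."""
    return f"{segment_id(frame_id)}.zip"


def frame_index(frame_id: int) -> int:
    """Zero-based position of the frame within its segment."""
    return frame_id - segment_id(frame_id)
```

For example, `segment_archive(2406983)` yields `2000000.zip`.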
The total file size of the complete database is around 4.5 GB. If you do not want to download the entire database in one go, first download the GPS track and the all_meta.txt.zip file. Import the GPS track into a GPS track viewer to identify the flight segments (and their timestamps) that are of interest to you. Alternatively, you can directly scan the all_meta.txt file for interesting segments (speed, elevation, distance, …). Based on the timestamp you can then find the FrameIDs in the all_meta.txt file and download the corresponding segment’s zip file, which contains all the data concerning the segment.
- Meta Data Files
- Compressed Data Files (with annotation file)
The leading 7 or 8-digit number indicates the SegmentID of the respective segment.
- 1000000.zip (1000000_seg.txt)
- Contains eight files: 1000000_meta.txt, 1000000_seg.txt, 1000000_MLSbin.bin, 1000000_MLSups.bin, 1000000_TX.bin, 1000000_RX.bin, 1000000_IQ.bin, 1000000_h.bin
- 2000000.zip (2000000_seg.txt)
- 3000000.zip (3000000_seg.txt)
- 4000000.zip (4000000_seg.txt)
- 5000000.zip (5000000_seg.txt)
- 6000000.zip (6000000_seg.txt)
- 7000000.zip (7000000_seg.txt)
- 8000000.zip (8000000_seg.txt)
- 9000000.zip (9000000_seg.txt)
- 10000000.zip (10000000_seg.txt)
File Formats and Naming
The file all_meta.txt.zip is a zip-compressed header-less plain-text ASCII file containing the meta-data for every frame.
Each line of the file corresponds to one frame.
2406983;16-May-2006 12:09:24;DL;47.159416200;16.321498870;1192.653;63.920;131.091;1262.380; 6; 6;3 9 11
2406984;16-May-2006 12:09:24;DL;47.158363340;16.322143550;1187.365;63.898;127.534;1182.908; 6; 6;3 9 11
2406985;16-May-2006 12:09:25;DL;47.158363340;16.322143550;1187.365;63.898;127.534;1182.908; 6; 6;3 9 11
The fields within each line are separated by a semicolon (;) and have the following meanings:
1. FrameID of the frame
2. Timestamp of the frame (in local time (CEST, UTC+2))
3. Transmission direction (UL for uplink (ground to aircraft) or DL for downlink (aircraft to ground))
4. Latitude of aircraft position (in decimal degrees, WGS84, -90 <= value <= 90)
5. Longitude of aircraft position (in decimal degrees, WGS84, -180 <= value < 180)
6. True altitude of aircraft (in meters above sea level (NN))
7. Speed of aircraft (in m/s, speed track slightly smoothed)
8. Relative azimuth (in decimal degrees, angle between line-of-sight and aircraft (AC) heading)
- 0 degrees: AC flies away from tower
- 90 degrees: AC flies clockwise around tower
- 180 degrees: AC flies towards tower
- 270 degrees: AC flies counter-clockwise around tower
9. Distance between aircraft and tower (in meters)
10. Type of transmitted signal (numeric)
- 0: UNKNOWN (we do not know what is there)
- 1: SILENCE (nothing is transmitted/received (PTT is on or off))
- 3: DTMF (DTMF tone)
- 6: MLS (Measurement signal)
- 9: SPEECH (speech with margin)
- 2: SILENCE_AROUND_DTMF (the short silent regions around the DTMF tones)
- 4: TRANS_SILENCE_TO_DTMF (Transition region)
- 5: TRANS_DTMF_TO_SILENCE (Transition region)
- 7: TRANS_TO_MLS (Transition region)
- 8: TRANS_END_MLS (Transition region)
- 10: TRANS_UNKNOWN (Transition from something to something (unspecified))
11. Type of received signal (numeric)
- see field 10
12. Experimental situation (list of numerics, delimited by white space)
- 1: ENG_OFF = Engine is off
- 2: ENG_SW_ON = Engine is started somewhere here
- 3: ENG_ON = Engine is on
- 4: STATIC = Aircraft is static on ground (txwy) - aircraft does not move
- 5: STATIC_MP = Aircraft is static on ground (txwy) - aircraft does not move - with car/van in proximity (5m) to possibly create multipath component
- 6: AC_30CM_TO_TWR = Aircraft is pulled 30cm towards tower
- 7: AC_30CM_AW_TWR = Aircraft is pushed 30cm away from tower
- 8: AC_ROLL_RNW = Aircraft is probably rolling on runway (not verified - use GPS data!)
- 9: AC_FLIGHT = Aircraft performs flight (incl. takeoff, landing, etc. Not verified - use GPS data!)
- 10: AC_TXWY = Aircraft is on the taxiway with line-of-sight (~54m) to the tower antenna. Heading towards tower.
- 11: EXP_GOOD = Good experimental conditions. Use these data sets if you do not need the full data set. Recording gains, etc. were set more accurately in these segments.
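The line format described above could be parsed roughly as follows. This is a sketch only: the `FrameMeta` class and its attribute names are invented for illustration, and the timestamp parsing assumes English three-letter month abbreviations as in the sample lines.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FrameMeta:
    """One line of all_meta.txt (attribute names are illustrative)."""
    frame_id: int
    timestamp: datetime       # local time, no time zone attached
    direction: str            # "UL" or "DL"
    latitude_deg: float
    longitude_deg: float
    altitude_m: float
    speed_mps: float
    rel_azimuth_deg: float
    distance_m: float
    tx_type: int
    rx_type: int
    situation: list[int]


def parse_meta_line(line: str) -> FrameMeta:
    """Split one semicolon-separated meta line into typed fields."""
    f = [part.strip() for part in line.strip().split(";")]
    if len(f) != 12:
        raise ValueError(f"expected 12 fields, got {len(f)}")
    return FrameMeta(
        frame_id=int(f[0]),
        # Assumes an English/C locale for the month abbreviation (%b).
        timestamp=datetime.strptime(f[1], "%d-%b-%Y %H:%M:%S"),
        direction=f[2],
        latitude_deg=float(f[3]),
        longitude_deg=float(f[4]),
        altitude_m=float(f[5]),
        speed_mps=float(f[6]),
        rel_azimuth_deg=float(f[7]),
        distance_m=float(f[8]),
        tx_type=int(f[9]),
        rx_type=int(f[10]),
        situation=[int(code) for code in f[11].split()],
    )
```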
The tower is positioned at longitude 16.3141 (E), latitude 47.1492 (N), and an elevation of 290 m (units as above).
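With the tower position given above, the distance between aircraft and tower can be approximately recomputed from latitude and longitude. How the distance field of the database was actually computed is not documented here. The sketch below uses the haversine formula on a spherical Earth, which is an assumption, as are the radius constant and the function name.

```python
import math

TOWER_LAT_DEG = 47.1492   # tower latitude as given above
TOWER_LON_DEG = 16.3141   # tower longitude as given above
EARTH_RADIUS_M = 6371000.0  # mean Earth radius (assumption of this sketch)


def ground_distance_m(lat_deg: float, lon_deg: float,
                      lat0_deg: float = TOWER_LAT_DEG,
                      lon0_deg: float = TOWER_LON_DEG) -> float:
    """Great-circle (haversine) distance in meters, ignoring altitude."""
    phi0 = math.radians(lat0_deg)
    phi1 = math.radians(lat_deg)
    dphi = phi1 - phi0
    dlam = math.radians(lon_deg - lon0_deg)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi0) * math.cos(phi1) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

For the first sample line above (latitude 47.159416200, longitude 16.321498870) this gives a value within a few meters of the listed 1262.380 m, which suggests, for this one sample at least, that the field is a ground distance rather than a slant distance.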
The timestamp of each frame (that is, the link between recordings and meta-data) results from manual time readings and is therefore accurate only to within a couple of seconds. The meta-data results from position recordings (track points) that were taken every two seconds with a standard consumer GPS device. The accuracy is therefore roughly 10 meters. The track points are not interpolated; for each frame the nearest neighbour is chosen.
The identification of the type of the recorded signal in each frame (fields 10 and 11) was performed retrospectively based on automatic classification. Due to the size of the database the labels were not hand-checked and are therefore subject to classification errors. Nevertheless, the labels provide a good starting point to extract certain parts (signal types) from the database.
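As an example of using the labels as a starting point, the following sketch selects the FrameIDs of all frames labelled as MLS on both the transmitted and the received side and flagged EXP_GOOD. The function names are hypothetical, and the field positions follow the sample lines above (FrameID first, signal types and experimental situation last).

```python
MLS = 6        # signal type code for the measurement signal
EXP_GOOD = 11  # experimental situation code for good conditions


def is_good_mls_frame(line: str) -> bool:
    """True if the meta line is labelled MLS (TX and RX) and EXP_GOOD."""
    fields = line.strip().split(";")
    tx_type = int(fields[9])
    rx_type = int(fields[10])
    situation = {int(code) for code in fields[11].split()}
    return tx_type == MLS and rx_type == MLS and EXP_GOOD in situation


def select_frame_ids(lines) -> list[int]:
    """Return the FrameIDs of all matching, non-empty meta lines."""
    return [int(line.split(";", 1)[0])
            for line in lines
            if line.strip() and is_good_mls_frame(line)]
```

Because the labels were not hand-checked, a selection like this should be treated as a candidate list rather than a verified one.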
The compressed zip archives contain all files that correspond to one segment (see below). The first part of the file name is the SegmentID, which corresponds to the FrameID of the first frame in the segment.
The file SegmentID_meta.txt contains the meta information for the corresponding segment. The format is identical to the file ‘all_meta.txt’ (see above). The file ‘all_meta.txt’ is a concatenation of all the segments’ _meta.txt files.
The file SegmentID_seg.txt contains meta-information that is valid and constant for the entire segment, i.e. all the frames in the segment.
It is a header-less plain-text ASCII file with two data entries per line, the first one being a field descriptor and the second one being the actual data, delimited with a semicolon.
Sampling Frequency in Segment;48000
Sampling Frequency of Impulse Response h;8000
The following fields are given:
- Time Stamp of 0th Frame (timestamp of the first frame, i.e. the frame whose FrameID equals the SegmentID)
- Transmission Direction in Segment (UL or DL, see above)
- Sampling Frequency in Segment (sampling rate of the TX and RX recordings)
- Length of Frames (frame length in samples of the TX and RX recordings)
- Frames in Segment (number of frames in the segment)
- Sampling Frequency of Impulse Response h (sampling rate of the estimated impulse response)
- Length of Impulse Response h (length of the estimated impulse response in samples)
- Free text comment
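A SegmentID_seg.txt file with this layout could be read into a dictionary as sketched below. Only the two descriptors shown in the example lines above are used here; the exact spelling of the other descriptors should be taken from the files themselves. The function name is illustrative.

```python
def parse_seg_text(text: str) -> dict[str, str]:
    """Map each field descriptor to its (string) value.

    Only the first semicolon on a line is treated as the delimiter,
    so a free-text comment containing semicolons stays intact.
    """
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        descriptor, _, value = line.partition(";")
        entries[descriptor.strip()] = value.strip()
    return entries


example = (
    "Sampling Frequency in Segment;48000\n"
    "Sampling Frequency of Impulse Response h;8000\n"
)
info = parse_seg_text(example)
rate_hz = int(info["Sampling Frequency in Segment"])
```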
The file SegmentID_MLSbin.bin contains the original binary Maximum Length Sequence (MLS) that was used for the measurements. Its length is 63 samples and it assumes a sampling rate of 8 kHz. The file is in binary little-endian machine format and one-channel (mono), with the samples being in 32-bit IEEE floating point format (single precision according to IEEE 754) (Matlab: ‘float32’, ‘ieee-le’).
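Outside Matlab, the same format can be decoded with standard tools. The following sketch uses Python's `struct` module; the function names and the length check are illustrative additions.

```python
import struct

MLS_LENGTH = 63  # samples in the MLS file, as stated above


def decode_float32_le(raw: bytes) -> list[float]:
    """Decode little-endian 32-bit IEEE floats from raw bytes."""
    if len(raw) % 4 != 0:
        raise ValueError("byte count is not a multiple of 4")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


def read_mls(path: str) -> list[float]:
    """Read an MLS file and check that it holds exactly 63 samples."""
    with open(path, "rb") as fh:
        samples = decode_float32_le(fh.read())
    if len(samples) != MLS_LENGTH:
        raise ValueError(f"expected {MLS_LENGTH} samples, got {len(samples)}")
    return samples
```

A call such as `read_mls("1000000_MLSbin.bin")` would then return the sequence as a list of floats.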
The file SegmentID_MLSups.bin contains the same MLS, but upsampled to 48 kHz. Depending on the segment, two slightly different upsampling algorithms were applied. The file format is the same as for SegmentID_MLSbin.bin.
SegmentID_TX.bin and SegmentID_RX.bin
These files are the core of the database. They are the actual transmission recording pairs with the audio recording of the transmitted signal (_TX) and the audio recording of the received signal (_RX). Whether the _RX file is the recording of the aircraft radio receiver or the ground radio receiver depends on the transmission direction in the current segment.
The files are in binary little-endian machine format and one-channel (mono), with the samples being in 16bit signed integer PCM format (Matlab: ‘int16’, ‘ieee-le’). The underlying sampling rate is 48 kHz and each block of 378 samples forms one frame.
Example: File 3000000_TX.bin:
- Frame 3000000: Sample 1…378 <- first 756 bytes
- Frame 3000001: Sample 379…756
- Frame 3000002: Sample 757…1134
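The frame layout in the example above translates directly into a byte offset: frame number (FrameID minus SegmentID) times 378 samples times 2 bytes. A minimal sketch for reading one frame from an opened _TX.bin or _RX.bin file follows; the function name is hypothetical.

```python
import io
import struct

FRAME_SAMPLES = 378    # samples per frame at 48 kHz
BYTES_PER_SAMPLE = 2   # 16-bit signed integer PCM
FRAME_BYTES = FRAME_SAMPLES * BYTES_PER_SAMPLE  # 756 bytes


def read_pcm_frame(fh, frame_id: int, segment_id: int) -> list[int]:
    """Return the 378 int16 samples of one frame from a binary file object."""
    index = frame_id - segment_id
    if index < 0:
        raise ValueError("frame_id lies before the start of the segment")
    fh.seek(index * FRAME_BYTES)
    raw = fh.read(FRAME_BYTES)
    if len(raw) != FRAME_BYTES:
        raise EOFError("frame lies beyond the end of the file")
    return list(struct.unpack(f"<{FRAME_SAMPLES}h", raw))


# Self-contained demonstration on synthetic data (three frames: 0..1133).
demo = io.BytesIO(struct.pack("<1134h", *range(1134)))
second_frame = read_pcm_frame(demo, 3000001, 3000000)
```

With a real file, the file object would come from `open("3000000_TX.bin", "rb")`.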
Although the data should be considered on a frame-by-frame basis, the RX files contain the continuous raw data of the segment as it was recorded during the measurements, including MLS transmissions, DTMF tones, voice transmissions, silence, etc. Due to the inherent (albeit small) difference in the sampling rate of the TX and RX recorders, the recordings would drift apart. In order to circumvent this, each frame of the TX file is individually selected based on best temporal alignment to the RX recording using the GPS timing signal which was recorded in parallel. As a consequence, the TX file is not equal to the original recording anymore, but has missing or inserted samples at the frame edges in order to achieve synchronisation between the RX and TX frames.
One note about gain:
The signal level of the recordings does not, in general, represent the transmitted or received signal levels. The signal gain was influenced by several elements in the recording chain, some of which may not have been constant for the entire segment. (Especially in the segments 5000000, 6000000, 7000000 and 8000000, a number of arbitrary manual gain changes occurred in the ground recording.) Local, and especially periodic, variations of signal level do, however, have significance.
Based on the (re-aligned) transmission pairs from above, the file SegmentID_h.bin contains estimated channel impulse responses for every frame. The estimation is based on FIR system identification and uses FFT cross-correlation to compute the impulse response of the channel given the input-output pairs. For numerical stability the signals were downsampled before system identification to a sampling rate of 8 kHz, as the channel was excited only up to a frequency of 4 kHz.
The file is in binary little-endian machine format and one-channel (mono), with the impulse response samples being in 32-bit IEEE floating point format (single precision according to IEEE 754) (Matlab: ‘float32’, ‘ieee-le’). The underlying sampling rate is 8 kHz and each block of 63 samples forms the impulse response of one frame.
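The principle behind MLS-based identification by cross-correlation can be illustrated with a toy example. This is not the processing chain used to create the database. It assumes an ideal, noise-free, circular channel and a bipolar (+1/-1) MLS generated from the primitive polynomial x^6 + x + 1 (the sequence in SegmentID_MLSbin.bin may differ), and it uses direct summation instead of an FFT to stay dependency-free.

```python
def mls63() -> list[int]:
    """Bipolar MLS of length 63 from a 6-stage LFSR (x^6 + x + 1)."""
    bits = [1, 0, 0, 0, 0, 0]
    while len(bits) < 63:
        bits.append(bits[-5] ^ bits[-6])  # a[n] = a[n-5] XOR a[n-6]
    return [1 - 2 * b for b in bits]      # map 0 -> +1, 1 -> -1


def circular_convolve(h, x):
    """y[i] = sum_m h[m] * x[(i - m) mod N]."""
    n = len(x)
    return [sum(h[m] * x[(i - m) % n] for m in range(n)) for i in range(n)]


def circular_xcorr(y, x):
    """r[k] = sum_i y[i] * x[(i - k) mod N]."""
    n = len(x)
    return [sum(y[i] * x[(i - k) % n] for i in range(n)) for k in range(n)]


def estimate_impulse_response(y, x):
    """Recover h from one period of output y and the bipolar MLS x.

    The MLS autocorrelation is N at lag 0 and -1 elsewhere, so
    r[k] = (N + 1) * h[k] - sum(h), and sum(r) equals sum(h).
    """
    n = len(x)
    r = circular_xcorr(y, x)
    total = sum(r)
    return [(rk + total) / (n + 1) for rk in r]


x = mls63()
h_true = [0.0] * 63
h_true[0], h_true[3], h_true[10] = 1.0, -0.5, 0.25
h_est = estimate_impulse_response(circular_convolve(h_true, x), x)
```

In this idealised setting `h_est` matches `h_true` up to rounding. Real recordings add noise, gain changes and timing offsets, which this sketch ignores.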
The file SegmentID_IQ.bin contains the digital IQ data that was recorded in parallel to all transmissions.
The file is in binary little-endian machine format and interleaved two-channel (!), with the samples being in 16bit signed integer PCM format (Matlab: ‘int16’, ‘ieee-le’). The I- and Q- components are interleaved on a per data point basis, i.e. the binary samples correspond to: I(1),Q(1),I(2),Q(2),I(3),Q(3),… with I(n) and Q(n) being the real and imaginary components of the n-th IQ data point. The underlying sampling rate is 32 kHz (!) and each block of 2*252 (!) samples forms one frame.
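The interleaving described above can be undone as in the following sketch, which turns the raw bytes of one IQ frame (2*252 int16 values, i.e. 1008 bytes) into 252 complex data points. The function name is illustrative, and no polarity correction is applied.

```python
import struct

IQ_FRAME_POINTS = 252                    # complex data points per frame
IQ_FRAME_BYTES = IQ_FRAME_POINTS * 2 * 2  # I and Q, 2 bytes each


def decode_iq_frame(raw: bytes) -> list[complex]:
    """Convert one interleaved int16 I/Q frame into complex samples."""
    if len(raw) != IQ_FRAME_BYTES:
        raise ValueError(f"expected {IQ_FRAME_BYTES} bytes, got {len(raw)}")
    values = struct.unpack(f"<{2 * IQ_FRAME_POINTS}h", raw)
    # Even positions hold I(n), odd positions hold Q(n).
    return [complex(i, q) for i, q in zip(values[0::2], values[1::2])]
```

The n-th frame of a segment would start at byte offset `n * IQ_FRAME_BYTES` in the file, assuming the frames are stored back to back as in the TX and RX files.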
For the IQ recording no parallel GPS timing signal is available. Additionally, due to a malfunction in the IQ recording system, the IQ data repeatedly lacks short blocks of data (a couple of hundred bits every few tens of seconds). For rough synchronisation with the ground recording, the DTMF tones that were embedded in the transmitted signal were used. The actual frame boundaries were then individually set so that the correlation between the (demodulated) IQ data frame and the previously determined ground recording frame is maximum. However, this synchronisation is not as accurate as the synchronisation between TX and RX frames, and local errors due to the drop-outs are possible.
It was observed that the signal polarity of the IQ recording is inverted when the transmission direction is downlink, i.e. the aircraft is transmitting. This has to be considered when using the data.
The GPS track file is the original GPS track log for all flights. It is in GPX (GPS Exchange Format), a lightweight XML data format that can be read by most GPS applications. It contains track points with position and timestamp, which were mostly taken every two seconds. All the information is already included in the SegmentID_meta.txt files in a more accessible format. However, the GPX file is useful for visualizing and browsing through the flight tracks using programs such as FlightTrack (OS X) or one of the many alternatives. Google Earth can also import and visualize GPX files, but it does not seem to be able to show the timestamps of the individual track points.
For any questions or remarks about the database please contact Konrad Hofbauer. We highly appreciate your feedback and, as far as time allows, will try to help resolve any issue concerning the database. If, on the other hand, you make substantial improvements to the database (e.g. manual annotations, improved channel response estimations, …) and are willing to share your results, please let us know so that we can discuss incorporating your work into a future release of the database.
In simple words, feel free to download and make use of the database for research and development, also in a commercial environment. However, if you want to redistribute the database or modified versions of it, certain restrictions apply. For example, you have to distribute the resulting work under a license identical to this one and prominently state the original source of this work, including a reference to this website. If you feel like the license terms prevent you from doing something you would like to do, please contact the maintainer of the database and a simple and probably free-of-cost solution can be found.