This is a collection of often used and misused technical terms regarding video, compression and networking.
Many sources contributed to this list.
If you wish to contribute, correct a mistake, or send your comments and impressions, please contact:
Luigi.Filippini@crs4.it
Arithmetic Coding
Arithmetic coding removes this restriction by representing messages as
intervals of the real numbers between 0 and 1. Initially, the range of
values for coding a text is the entire interval [0, 1]. As encoding
proceeds, this range narrows while the number of bits required to represent
it expands. Frequently occurring characters reduce the range less than
characters occurring infrequently, and thus add fewer bits to the length of
an encoded message.
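The narrowing of the interval can be sketched in a few lines of Python. This is a minimal illustration rather than a production coder: the function name `narrow_interval` and the two-symbol probability table are assumptions made here, and exact fractions stand in for the finite-precision arithmetic a real coder would need.

```python
from fractions import Fraction
from math import ceil, log2

def narrow_interval(message, probs):
    """Return the final [low, high) interval for `message` under a fixed model."""
    # Give each symbol its own slice of [0, 1), sized by its probability.
    ranges = {}
    start = Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (start, start + p)
        start += p

    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        s_lo, s_hi = ranges[sym]
        # Keep only the part of the current interval that belongs to `sym`.
        low, high = low + width * s_lo, low + width * s_hi
    return low, high

# Hypothetical model: "a" is frequent, "b" is rare.
probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}
lo_a, hi_a = narrow_interval("aaaa", probs)
lo_b, hi_b = narrow_interval("bbbb", probs)

# Roughly -log2(width) bits are needed to single out an interval.
bits_common = ceil(-log2(float(hi_a - lo_a)))
bits_rare = ceil(-log2(float(hi_b - lo_b)))
```

Four frequent symbols leave a wide interval (width 81/256), while four rare symbols leave a narrow one (width 1/256), so the rare message needs more bits.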
ATM
ATM (Asynchronous Transfer Mode) is a switching/transmission technique
where data is transmitted in small, fixed-size cells (5-byte header,
48-byte payload). The cells lend themselves both to the time-division
multiplexing characteristics of the transmission media, and the packet
switching characteristics desired of data networks. At each switching
node, the ATM header identifies a virtual path or virtual circuit
that the cell contains data for, enabling the switch to forward the
cell to the correct next-hop trunk. The virtual path is set up
through the involved switches when two endpoints wish to communicate.
This type of switching can be implemented in hardware, which is almost
essential when trunk speeds range from 45 Mb/s to 1 Gb/s.
B-Y R-Y
The human visual system has much less acuity for spatial variation of
colour than for brightness. Rather than conveying RGB, it is advantageous
to convey luma in one channel, and colour information that has had luma
removed in the two other channels. In an analog system, the two colour
channels can have less bandwidth, typically one-third that of luma. In a
digital system each of the two colour channels can have considerably
less data rate (or data capacity) than luma.
Green dominates the luma channel: about 59% of the luma signal comprises green information. Therefore it is sensible, and advantageous for signal-to-noise reasons, to base the two colour channels on blue and red. The simplest way to remove luma from each of these is to subtract it to form the difference between a primary colour and luma. Hence, the basic video colour-difference pair is (B-Y), (R-Y) [pronounced "B minus Y, R minus Y"].
The (B-Y) signal reaches its extreme values at blue (R=0, G=0, B=1;
Y=0.114; B-Y=+0.886) and at yellow (R=1, G=1, B=0; Y=0.886; B-Y=-0.886).
Similarly, the extrema of (R-Y), +-0.701, occur at red and cyan. These
are inconvenient values for both digital and analog systems. The colour
spaces YPbPr, YCbCr, PhotoYCC and YUV are simply scaled versions of (Y,
B-Y, R-Y) that place the extrema of the colour difference channels at more
convenient values.
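The extrema quoted above can be checked numerically. The sketch below assumes the conventional 525/625-line luma coefficients 0.299, 0.587 and 0.114, which are consistent with the figures in this entry (59% green, Y=0.114 for blue, R-Y extrema of +-0.701); the function names are illustrative only.

```python
# Assumed luma coefficients for red, green and blue (525/625-line video).
KR, KG, KB = 0.299, 0.587, 0.114

def luma(r, g, b):
    """Weighted sum of gamma-corrected R, G, B in the range 0..1."""
    return KR * r + KG * g + KB * b

def colour_difference(r, g, b):
    """Return (Y, B-Y, R-Y) for one RGB triple."""
    y = luma(r, g, b)
    return y, b - y, r - y

corners = {
    "blue": (0, 0, 1),
    "yellow": (1, 1, 0),
    "red": (1, 0, 0),
    "cyan": (0, 1, 1),
}
for name, rgb in corners.items():
    y, b_y, r_y = colour_difference(*rgb)
    print(f"{name:6s} Y={y:.3f}  B-Y={b_y:+.3f}  R-Y={r_y:+.3f}")
```

Scaled colour spaces such as YPbPr or YCbCr multiply these two differences by constants so that the extrema fall on more convenient values.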
Bridge
Bridges are devices that connect similar and dissimilar LANs at the data link
layer (OSI layer 2), regardless of the physical layer protocols or media being
used. Bridges require that the networks have consistent addressing
schemes and packet frame sizes. Current introductions have been termed
learning bridges since they are capable of updating node address (tracking)
tables as well as overseeing the transmission of data between two Ethernet
LANs.
Brouter
Brouters are bridge/router hybrid devices that offer the best capabilities of
both devices in one unit. Brouters are actually bridges capable of
intelligent routing and therefore are used as generic components to integrate
workgroup networks. The bridge function filters information that
remains internal to the network and is capable of supporting multiple
higher-level protocols at once.
The router component maps out the
optimal paths for the movement of data from one point on the network to
another. Since the brouter can handle the functions of both bridges and
routers, as well as bypass the need for the translation across application
protocols with gateways, the device offers significant cost reductions in
network development and integration.
CCITT
Comité Consultatif International Télégraphique et Téléphonique.
A committee of the International Telecommunication Union responsible
for making technical recommendations about telephone and data
communication systems for PTTs and suppliers. Plenary sessions are held
every four years to adopt new standards.
CD-DA
CD-DA (Compact Disc-Digital Audio) discs are standard music CDs.
CD-ROM grew out of CD-DA when people realized that a large amount of
computer data (about 650 MB) could be stored on a 12 cm optical disc. CD-ROM drives
are simply another kind of digital storage media for computers, albeit
read-only. They are peripherals just like hard disks and floppy drives.
(Incidentally, the convention is that when referring to magnetic media,
it is spelled disk. Optical media like CDs, LaserDisc, and all the other
formats are spelled disc.)
CD-I
CD-I means Compact Disc Interactive. It is meant to provide a standard
platform for mass consumer interactive multimedia applications. So it is
more akin to CD-DA, in that it is a full specification for both the
data/code and standalone playback hardware: a CD-I player has a CPU,
RAM, ROM, OS, and audio/video/(MPEG) decoders built into it. Portable
players add an LCD screen and speakers/phonejacks.
It has limited motion video and still image compression
capabilities. It was announced in 1986, and was in beta test by Spring 1989.
This is a consumer electronics format that uses the optical disc in combination with a computer to provide a home entertainment system that delivers music, graphics, text, animation, and video in the living room. Unlike a CD-ROM drive, a CD-I player is a standalone system that requires no external computer. It plugs directly into a TV and stereo system and comes with a remote control to allow the user to interact with software programs sold on discs. It looks and feels much like a CD player except that you get images as well as music out of it and you can actively control what happens. In fact, it is a CD-DA player and all of your standard music CDs will play on a CD-I player; there is just no video in that case.
For a CD-I disc, there may be as few as 1 or as many as 99 data tracks. The sector size in the data tracks of a CD-I disc is approximately 2 kbytes. Sectors are randomly accessible, and, in the case of CD-I, sectors can be multiplexed in up to 16 channels for audio and 32 channels for all other data types. For audio, these channels are equivalent to having 16 parallel audio data channels instantly accessible during the playing of a disc.
CD-ROM
A CD-ROM has several advantages over other forms of data storage, and a few disadvantages. A CD-ROM can hold about 650 megabytes of data, the equivalent of thousands of floppy disks. CD-ROMs are not damaged by magnetic fields or the X-rays in airport scanners. The data on a CD-ROM can be accessed much faster than data on a tape, but CD-ROMs are 10 to 20 times slower than hard disks.
You cannot write to a CD-ROM. You buy a disc with the data already
recorded on it. There are thousands of titles available.
CD-XA
CD-XA is a CD-ROM extension being designed to support digital audio
and still images.
Announced in August 1988 by Microsoft, Philips, and Sony, the CD-ROM XA (for Extended Architecture) format incorporates audio from the CD-I format. It is consistent with ISO 9660, (the volume and the structure of CD-ROM), is an application extension of the Yellow Book, and draws on the Green Book.
CD-XA defines another way of formatting sectors on a CD-ROM, including headers in the sectors that describe the type (audio, video, data) and some additional info (markers, resolution in case of a video or audio sector, file numbers, etc).
The data written on a CD-XA can still be in ISO9660 file system format and therefore be readable by MSCDEX and Unix CD-ROM file system translators. A CD-I player can also read CD-XA discs even if its own `Green Book' file system only resembles ISO9660 and isn't fully compatible. However, when a disc is inserted in a CD-I player, the player tries to load an executable application from the CD-XA, normally some 68000 application in the /CDI directory. Its name is stored in the disc's primary volume descriptor. CD-XA bridge discs, like Kodak's Photo CDs, do have such an application, ordinary CD-XA discs don't.
A CD-XA drive is a CD-ROM drive but with some of the compressed audio capabilities
found in a CD-I player (called ADPCM). This allows interleaving of audio and
other data so that an XA drive can play audio and display pictures (or other
things) simultaneously. There is special hardware in an XA drive controller
to handle the audio playback. This format came from a desire to inject some
of the features of CD-I back into the professional market.
Cell Compression (from Sun Microsystems, Inc.)
Cell is a compression technique developed by SMI. The compression
algorithms, the bit-stream definition, and the decompression algorithms
are open. That is, Sun will tell anybody who is interested about them.
Cell compression is similar to MPEG and H.261 in that there is a lot of
room for value-add on the compressor end. Getting the highest quality
image from a given bit count at a reasonable amount of compute is an
art. In addition, the bit-stream completely defines the compression
format and what the decoder must do, so there is less art in
the decoder.
There are two flavors of Cell: the original, called Cell or CellA, and a newer flavor called CellB. CellA is designed for "use-many-times" video, where one does not mind that the encoder runs at less than real time. Examples are CD-ROM playback, distance learning, and video help for applications. CellB is designed for "use-once" video, where the encoder must run at real-time (interactive) rates. Examples are video mail and video conferencing.
Both flavors of Cell use the same basic technique of representing each 4x4 pixel block with a 16-bit bitmask and two 8-bit vector-quantized codebook indices. This produces a compression of 12:1 (or 8:1), since each 16-pixel block, which would occupy 384 bits at 24 bits per pixel (or 256 bits at 16 bits per pixel), is represented by 32 bits (a 16-bit mask and two 8-bit codebook indices). In both flavors, further compression is accomplished by checking the current block against the spatially equivalent block in the previous frame. If the new block is "close enough" to the old block, the block is coded as a skip code. Consecutive skip codes are run-length encoded for further compression. Modifying the definition of "close enough" allows one to trade off quality against compression rate. Both versions of Cell typically compress video images down to about 0.75 to 0.5 bits/pixel.
Both flavors have many similar steps in the early part of compression. For each 4x4 block, the compressor calculates the average luma of the 16 pixels. It then partitions the pixels into two groups, those whose luma is above the average and those whose luma is below the average. The compressor sets the 16-bit bitmask based on which pixels are in each partition. The compressor then calculates a color to represent each partition.
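The partition step described above can be sketched as follows. This is only an illustration of the idea, not the XIL bitstream: the function name `partition_block`, the bit order of the mask (bit i for pixel i in raster order), and the treatment of pixels exactly equal to the average (placed in the low group) are all assumptions made here.

```python
def partition_block(lumas):
    """Split a 4x4 block of luma values into a 16-bit mask and two mean lumas.

    `lumas` holds the 16 luma values in raster order.
    Returns (mask, y_lo, y_hi); bit i of mask is set when pixel i is
    brighter than the block average (assumed bit order).
    """
    if len(lumas) != 16:
        raise ValueError("a Cell block has exactly 16 pixels")

    average = sum(lumas) / 16
    mask = 0
    high_group, low_group = [], []
    for i, y in enumerate(lumas):
        if y > average:
            mask |= 1 << i
            high_group.append(y)
        else:
            low_group.append(y)

    # low_group is never empty: the darkest pixel cannot exceed the average.
    y_lo = sum(low_group) / len(low_group)
    # A perfectly flat block has no bright pixels; reuse y_lo in that case.
    y_hi = sum(high_group) / len(high_group) if high_group else y_lo
    return mask, y_lo, y_hi

# A block whose top half is dark and bottom half is bright.
mask, y_lo, y_hi = partition_block([10] * 8 + [200] * 8)

# 16 pixels at 24 bits each versus a 32-bit encoded block.
compression_ratio = (16 * 24) / 32
```

In the real coders the two representative values are then vector quantized against a codebook, which this sketch leaves out.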
In Cell, the compressor calculates an average color for each partition and then does a vector quantization against the Cell codebook (which is just a color map). The encoded block is the 16-bit mask and the two 8-bit colormap indices. The compressor maintains statistics about how much error each codebook entry is responsible for and how many times each codebook entry is used. It uses these numbers to adaptively refine the codebook on each frame. Changed codebooks are sent in the bitstream.
In CellB, the compressor calculates the average luma for each partition and the average chroma for the entire block. This gives two colors, [Y_lo, Cb_ave, Cr_ave] and [Y_hi, Cb_ave, Cr_ave]. The pair [Y_lo, Y_hi] is vector quantized against the Y/Y codebook and the pair [Cb_ave, Cr_ave] is vector quantized against the Cr/Cb codebook. Here the encoded block is the 16-bit mask and the two 8-bit VQ indices. Both of CellB's codebooks are fixed. This allows both the compressor and decompressor to run at high speed by using table lookups. Both codebooks are designed with the human visual system in mind; they are not just a uniform partition of the Y/Y or Cr/Cb space. Each codebook has fewer than 256 entries.
Cell (or CellA) is supported in XIL 1.0 from SMI. It is part of Solaris 2.2. CellB is supported in XIL 1.1 from SMI. It will be part of Solaris 2.3 when that becomes available. Complete bitstream definitions for both flavors of cell are in the XIL 1.1 programmer's guide. There is some discussion of the CellA bitstream in the XIL 1.0 programmer's guide.
CellB was used for the SMI Scott McNealy holiday broadcast, where he
talked to the company in real time over the Sun Wide Area Network. This
broadcast reached from Tokyo, Japan to Munich, Germany, with over 3000 known
viewers.
CIF
Common Image Format. The standardization of the structure of the samples
that represent the picture information of a single frame in digital HDTV,
independent of frame rate and sync/blank structure.
The uncompressed bit rate for transmitting CIF at 29.97 frames/sec
is 36.45 Mbit/sec.
DPCM (Differential Pulse Code Modulation)
Differential pulse code modulation (DPCM) is a source coding scheme that
was developed for encoding sources with memory.
The reason for using the DPCM structure is that for most sources of practical
interest, the variance of the prediction error is substantially smaller than
that of the source.
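A small sketch makes the variance argument concrete. It assumes the simplest possible predictor (the previously reconstructed sample) and a uniform quantizer; the function names and the sample values are illustrative and not taken from any standard.

```python
def dpcm_encode(samples, step=1):
    """Encode samples as quantized prediction errors."""
    codes = []
    prediction = 0
    for s in samples:
        error = s - prediction
        q = round(error / step)      # quantized prediction error
        codes.append(q)
        prediction += q * step       # track what the decoder will rebuild
    return codes

def dpcm_decode(codes, step=1):
    """Rebuild samples by accumulating the quantized errors."""
    out = []
    prediction = 0
    for q in codes:
        prediction += q * step
        out.append(prediction)
    return out

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# A slowly varying source: neighbouring samples are strongly correlated.
samples = [100, 102, 105, 107, 108, 110, 111, 115, 118, 120]
codes = dpcm_encode(samples)
restored = dpcm_decode(codes)

source_variance = variance(samples)
# Skip the first code, which carries the start value rather than a difference.
error_variance = variance(codes[1:])
```

The encoder predicts from its own reconstruction rather than from the original samples, so with a coarser `step` the quantization error does not accumulate at the decoder.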
DVI (Digital Video Interactive)
Digital Video Interactive (DVI) technology brings television to
the microcomputer. DVI's concept is simple: information is
digitized and stored on a random-access device such as a hard disk
or a CD-ROM, and is accessed by a computer. DVI requires
extensive compression and real-time decompression of images.
Until recently this capability was missing. DVI enables new
applications. For example, a DVI CD-ROM disc on twentieth-century
artists might consist of 20 minutes of motion video; 1,000
high-res still images, each with a minute of audio; and 50,000
pages of text.
DVI uses the YUV system, which is also used by the European PAL color
television system.
The Y channel encodes luminance and the U and V channels encode
chrominance. For DVI, we subsample 4-to-1 both vertically and horizontally
in U and V, so that each of these components requires only 1/16 the
information of the Y component. This provides a compression from the 24-bit
RGB space of the original to 9-bit YUV space.
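The 9-bit figure follows directly from the subsampling factors. The short calculation below assumes 8 bits per sample (implied by the 24-bit RGB starting point); the function name is illustrative.

```python
from fractions import Fraction

def average_bits_per_pixel(bits_per_sample=8, h_subsample=4, v_subsample=4):
    """Average bits per pixel for full-resolution Y plus subsampled U and V."""
    # Each chroma plane carries 1/(h*v) as many samples as the Y plane.
    chroma_share = Fraction(1, h_subsample * v_subsample)
    return bits_per_sample * (1 + 2 * chroma_share)

yuv_bits = average_bits_per_pixel()   # 8 + 8/16 + 8/16
rgb_bits = 3 * 8
```

Y contributes 8 bits per pixel and each chroma channel contributes half a bit, giving 9 bits against the original 24.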
The DVI concept originated in 1983 in the inventive environment of the David
Sarnoff Research Center in Princeton, New Jersey, then also known as RCA
Laboratories. The ongoing research and development of television since the
early days of the Laboratories was extending into the digital domain, with
work on digital tuners, and digital image processing algorithms that could be
reduced to cost-effective hardware for mass-market consumer television.
EACEM
European Association of Consumer Electronics Manufacturers
EDTV
Extended [or Enhanced] Definition Television. A television system that offers
picture quality substantially improved over conventional 525-line or 625-line
receivers, by employing techniques at the transmitter and at the receiver
that are transparent to (and cause no visible quality degradation to) existing
525-line or 625-line receivers. One example of EDTV is the improved separation
of luminance and colour components by pre-combing the signals prior to
transmission, using techniques that have been suggested by Faroudja,
Central Dynamics and Dr William Glenn.
Entropy
Entropy, the average amount of information represented by a symbol in a
message, is a function of the model used to produce that message and can be
reduced by increasing the complexity of the model so that it better reflects
the actual distribution of source symbols in the original message.
Entropy is a measure of the information contained in a message; it is the
lower bound for lossless compression under the model used.
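The dependence of entropy on the model can be illustrated with two simple models of the same message. In this sketch the first function treats symbols as independent (an order-0 model) and the second conditions each symbol on its predecessor (an order-1 model); both function names are made up for this example.

```python
from collections import Counter
from math import log2

def entropy_order0(message):
    """Average bits per symbol when symbols are modelled as independent."""
    counts = Counter(message)
    n = len(message)
    return -sum(c / n * log2(c / n) for c in counts.values())

def entropy_order1(message):
    """Average bits per symbol when each symbol is predicted from the previous one."""
    pairs = Counter(zip(message, message[1:]))
    contexts = Counter(message[:-1])
    n = len(message) - 1
    return -sum(
        c / n * log2(c / contexts[prev]) for (prev, _), c in pairs.items()
    )

message = "abababab"
h0 = entropy_order0(message)   # two equally likely symbols
h1 = entropy_order1(message)   # the next symbol is fully determined
```

For this message the order-0 model sees two equally likely symbols (1 bit each), while the order-1 model finds that each symbol is completely determined by the one before it, so the richer model drives the entropy to zero.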
ESAC
Economics and Statistics Advisory Committee
ESPRIT
European Strategic Programme for Research and Development in Information Technology
ETSI
European Telecommunications Standards Institute
FFT
Fast Fourier Transform
Gateway
Gateways provide functional bridges between networks by receiving protocol
transactions on a layer-by-layer basis from one protocol (SNA) and
transforming them into comparable functions for the other protocol (OSI).
In short, the gateway provides a connection with protocol translation between
networks that use different protocols. Interestingly enough, gateways,
unlike the bridge, do not require that the networks have consistent
addressing schemes and packet frame sizes. Most proprietary gateways (such
as IBM SNA gateways) provide protocol converter functions up through layer
six of the OSI, while OSI gateways perform protocol translations up through
OSI layer seven.
H.261
Recognizing the need for providing ubiquitous video services using the
Integrated Services Digital Network (ISDN), CCITT (International
Telegraph and Telephone Consultative Committee) Study Group XV established a
Specialist Group on Coding for Visual Telephony in 1984 with the objective of
recommending a video coding standard for transmission at m x 384 kbit/s
(m=1,2,..., 5). Later in the study period after new discoveries in video
coding techniques, it became clear that a single standard,
p x 64 kbit/s (p = 1,2,..., 30), can cover the entire ISDN channel
capacity. After more than five years of intensive deliberation, CCITT
Recommendation H.261, Video Codec for Audiovisual Services at p x 64 kbit/s,
was completed and approved in December 1990. A slightly
modified version of this Recommendation was also adopted for use in North
America.
The intended applications of this international standard are for videophone
and videoconferencing. Therefore, the recommended video coding algorithm has
to be able to operate in real time with minimum delay. For p = 1 or 2, due
to severely limited available bit rate, only desktop face-to-face visual
communication (often referred to as videophone) is appropriate. For p>=6,
due to the additional available bit rate, more
complex pictures can be transmitted with better quality. This is, therefore,
more suitable for videoconferencing.
HDTV
High-Definition Television. A television system with approximately twice the
horizontal and twice the vertical resolution of current 525-line and 625-line
systems, component colour coding (e.g. RGB or YCbCr), a picture aspect ratio
of 16:9 and a frame rate of at least 24 Hz.
Currently there are a number of proposed HDTV standards, including HD-MAC,
HiVision and others.
Hybrid Coder
In the archetypal hybrid coder, an estimate of the next frame to be processed
is formed from the current frame and the difference is then encoded by some
purely intraframe mechanism. In recent years, the most attention has been
paid to the motion-compensated DCT coder where the estimate is formed by a
two-dimensional warp of the previous frame and the difference is encoded
using a block transform (the Discrete Cosine Transform).
This system is the basis for international standards for videotelephony, is used for some HDTV demonstrations, and is the prototype from which MPEG was designed. Its utility has been demonstrated for video sequences, and the DCT concentrates the remaining energy into a small number of transform coefficients that can be quantized and compactly represented.
The key feature of this coder is the presence of a complete decoder within it. The difference between the current frame as represented at the receiver and the incoming frame is processed. In the basic design, therefore, the receiver must track the transmitter precisely: the decoder at the receiver and the decoder at the transmitter must match. The system is sensitive to channel errors and does not permit random access. However, it is on the order of three to four times as efficient as one that uses no prediction.
In practice, this coder is modified to suit specific applications. The
standard telephony model uses a forced update of the decoded frame so that
channel errors do not propagate. When a participant enters the conversation
late or alternates between image sources, residual errors die out and a clear
image is obtained after a few frames. Similar techniques are used in
versions of this coder being developed for direct satellite television
broadcasting.
Huffman Coding
For a given character distribution, by assigning short codes to frequently
occurring characters and longer codes to infrequently occurring characters,
Huffman's minimum redundancy encoding minimizes the average number of bits
required to represent the characters in a text.
Static Huffman encoding uses a fixed set of codes, based on a representative sample of data, for processing texts. Although encoding is achieved in a single pass, the data on which the compression is based may bear little resemblance to the actual text being compressed.
Dynamic Huffman encoding, on the other hand, reads each text twice; once to determine the frequency distribution of the characters in the text and once to encode the data. The codes used for compression are computed on the basis of the statistics gathered during the first pass with compressed texts being prefixed by a copy of the Huffman encoding table for use with the decoding process.
By using a single-pass technique, where each character is encoded on the
basis of the preceding characters in a text, Gallager's adaptive Huffman
encoding avoids many of the problems associated with either the static or
dynamic method.
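The basic code construction shared by these variants can be sketched as below. This is a minimal illustration that builds the code table from the frequencies of the text itself; it is not Gallager's adaptive scheme, and the function name `huffman_codes` is an assumption of this example.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a {symbol: bit string} table built from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}

    # Heap entries are (weight, tiebreak, partial code table); the unique
    # tiebreak keeps the tables themselves from ever being compared.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)

    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

text = "aaaaaaaabbbc"
codes = huffman_codes(text)
encoded_bits = sum(len(codes[ch]) for ch in text)
```

The frequent symbol "a" receives a one-bit code and the rarer "b" and "c" receive two-bit codes, so the twelve characters are encoded in 16 bits.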
IDTV
Improved Definition Television. A television system that offers picture
quality substantially improved over conventional receivers, for signals
originated in standard 525-line or 625-line format, by processing that
involves the use of field store and/or frame store (memory) techniques at the
receiver. One example is the use of field or frame memory to implement
de-interlacing at the receiver in order to reduce interline twitter compared
to that of an interlaced display. IDTV techniques are implemented entirely
at the receiver and involve no change to picture origination equipment and
no change to emission standards.
IEC
International Electrotechnical Commission. A standardisation body at the
same level as ISO.
Interactive videodisc
Interactive video-disc is another video related technology, using an analog
approach. It has been available since the early 1980s, and is supplied in
the U.S. primarily by Pioneer, Sony, and IBM.
ISDN
ISDN stands for "Integrated Services Digital Network", and it's a CCITT
term for a relatively new telecommunications service package. ISDN is
basically the telephone network turned all-digital end to end, using
existing switches and wiring (for the most part) upgraded so that the
basic call is a 64 kbps end-to-end channel, with bit-diddling as needed
(but not when not needed!). Packet and maybe frame modes are thrown in
for good measure, too, in some places. It's offered by local telephone
companies, but most readily in Australia, France, Japan, and Singapore,
with the UK and Germany somewhat behind, and USA availability rather spotty.
A Basic Rate Interface (BRI) is two 64K bearer ("B") channels and a single
delta ("D") channel. The B channels are used for voice or data, and the D
channel is used for signaling and/or X.25 packet networking. This is the
variety most likely to be found in residential service.
Another flavor of ISDN is Primary Rate Interface (PRI). Inside the US, this
consists of 24 channels, usually divided into 23 B channels and 1 D channel,
and runs over the same physical interface as T1. Outside the US, PRI
has 31 user channels, usually divided into 30 B channels and 1 D channel.
It is typically used for connections such as one between a PBX and a CO or
IXC.
Letter-box
A television system that limits the recording or transmission of useful
picture information to about three-quarters of the available vertical picture
height of the distribution format (e.g. 525-line) in order to offer program
material that has a wide picture aspect ratio.
Luma (Y)
Video originates with linear-light (tristimulus) RGB primary components,
conventionally contained in the range 0 (black) to +1 (white).
From the RGB triple, three gamma-corrected primary signals are computed;
each is essentially the 0.45-power of the corresponding tristimulus value,
similar to a square-root function.
In a practical system such as a television camera, however, in order to minimize noise in the dark regions of the picture it is necessary to limit the slope (gain) of the curve near black. It is now standard to limit gain to 4.5 below a tristimulus value of +0.018, and to stretch the remainder of the curve to place the Y-intercept at -0.099 in order to maintain function and tangent continuity at the breakpoint.
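The piecewise curve described above can be written as a short function. The gain of 4.5, the breakpoint of 0.018, the 0.45 exponent and the -0.099 intercept come from the text; the scale factor 1.099 is inferred here from the requirement that an input of 1 maps to an output of 1, and the function name is illustrative.

```python
def gamma_correct(tristimulus):
    """Map a linear-light value in 0..1 to a gamma-corrected signal in 0..1."""
    if tristimulus < 0.018:
        # Linear segment near black, with gain limited to 4.5.
        return 4.5 * tristimulus
    # Power-law segment, stretched and offset so the two pieces meet.
    return 1.099 * tristimulus ** 0.45 - 0.099

black = gamma_correct(0.0)
white = gamma_correct(1.0)
# The two segments should (nearly) agree at the breakpoint.
gap_at_breakpoint = abs(4.5 * 0.018 - gamma_correct(0.018))
```

With these rounded constants the two segments agree at the breakpoint to within about 0.0003, which is why the text speaks of function and tangent continuity.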
The luma coefficients are also a function of the white point (or chromaticity of reference white). Computer users commonly have a white point with a colour temperature in the range of 9300 K, which contains twice as much blue as the daylight reference CIE D65 used in television. This is reflected in pictures and monitors that look too blue.
Although television primaries have changed over the years since the adoption of the NTSC standard in 1953, the coefficients of the luma equation for 525 and 625 line video have remained unchanged. For HDTV, the primaries are different and the luma coefficients have been standardized with somewhat different values.
Lempel-Ziv Welch (LZW) Compression
Algorithm used by the Unix compress command to reduce the
size of files, e.g. for archival or transmission. The
algorithm relies on repetition of byte sequences (strings) in
its input. It maintains a table mapping input strings to
their associated output codes. The table initially contains
mappings for all possible strings of length one. Input is
taken one byte at a time to find the longest initial string
present in the table. The code for that string is output and
then the string is extended with one more input byte, b. A
new entry is added to the table mapping the extended string to
the next unused code (obtained by incrementing a counter).
The process repeats, starting from byte b. The number of bits
in an output code, and hence the maximum number of entries in
the table is usually fixed and once this limit is reached, no
more entries are added.
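The table-building loop described above can be sketched as follows. This is a minimal illustration that returns a list of integer codes; it does not pack them into bits the way the compress command does, and the function name and the 12-bit default table limit are assumptions of this example.

```python
def lzw_compress(data: bytes, max_code_bits: int = 12):
    """Return the list of LZW output codes for `data`."""
    # The table starts with every possible string of length one.
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    limit = 1 << max_code_bits      # no new entries once the table is full

    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            # Keep extending while the string is still known.
            current = candidate
        else:
            # Emit the longest known string, then remember the extension.
            codes.append(table[current])
            if next_code < limit:
                table[candidate] = next_code
                next_code += 1
            # Restart matching from the byte that broke the match.
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes

codes = lzw_compress(b"ABABABA")
```

For the input `ABABABA` the seven bytes are emitted as four codes: the two literals, then the learned entries for `AB` and `ABA`.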
Model-Based Coder
Communicating a higher-level model of the image than pixels is an active area
of research. The idea is to have the transmitter and receiver agree on the
basic model for the image; the transmitter then sends parameters to
manipulate this model in lieu of picture elements themselves. Model-based
decoders are similar to computer graphics rendering programs.
The model-based coder trades generality for extreme efficiency in its
restricted domain.
Better rendering and extending of the domain are research themes.
Modem (Modulator/demodulator)
An electronic device for converting
between serial data (typically RS-232) from a computer and an
audio signal suitable for transmission over telephone lines.
The audio signal is usually composed of silence (no data) or
one of two frequencies representing 0 and 1. Modems are
distinguished primarily by the baud rates they support which
can range from 75 baud up to 19200 and beyond.
Data to the computer is sometimes at a lower rate than data from
the computer on the assumption that the user cannot type more
than a few characters per second. Various data compression
and error-correction algorithms are required to support the highest
speeds. Other optional features are auto-dial (auto-call) and
auto-answer which allow the computer to initiate and accept
calls without human intervention.
NAB
National Association of Broadcasters
NHK
Nippon Hoso Kyokai, the principal Japanese broadcaster
NTSC (National Television System Committee)
USA video standard with image format 4:3, 525 lines, 60 Hz and
4 MHz video bandwidth with a total 6 MHz of video channel width.
NTSC uses YIQ.
OSI
The Open Systems Interconnection Reference Model was formally initiated by
the International Organization for Standardization (ISO) in March, 1977, in
response to the international need for an open set of communications
standards. OSI's objectives are:
The physical and data link layers provide the same functions as their SNA counterparts (physical control and data link control layers). The network layer selects routing services, segments blocks and messages, and provides error detection, recovery, and notification.
The transport layer controls point-to-point information interchange, data packet size determination and transfer, and the connection/disconnection of session entities.
The session layer serves to organize and synchronize the application process dialog between presentation entities, manage the exchange of data (normal and expedited) during the session, and monitor the establishment/release of transport connections as requested by session entities.
The presentation layer is responsible for the meaningful display of information to application entities.
More specifically, the presentation layer identifies and negotiates the
choice of communications transfer syntax and the subsequent data conversion
or transformation as required. The application layer affords the
interfacing of application processes to system interconnection facilities to
assist with information exchange. The application layer is also responsible
for the management of application processes including initialization,
maintenance and termination of communications, allocation of costs and
resources, prevention of deadlocks, and transmission security.
PAL (Phase Alternating Line)
European video standard with image format 4:3, 625 lines, 50 Hz and
4 MHz video bandwidth with a total 8 MHz of video channel width.
PAL uses YUV.
QCIF
Quarter Common source Intermediate Format (1/4 CIF, e.g. 176*144)
The uncompressed bit rate for transmitting QCIF at 29.97 frames/sec
is 9.115 Mbit/s.
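The quoted bit rates can be reproduced with a short calculation. It assumes the usual QCIF and CIF picture sizes of 176 x 144 and 352 x 288, and an average of 12 bits per pixel (8-bit luma with both chroma components subsampled by two horizontally and vertically); neither assumption is spelled out in this glossary, and the function name is illustrative.

```python
def uncompressed_mbit_per_s(width, height, fps=29.97, bits_per_pixel=12):
    """Raw video bit rate in Mbit/s (1 Mbit = 1,000,000 bits)."""
    return width * height * bits_per_pixel * fps / 1e6

qcif_rate = uncompressed_mbit_per_s(176, 144)
cif_rate = uncompressed_mbit_per_s(352, 288)
```

Under these assumptions QCIF comes to about 9.115 Mbit/s and CIF to about 36.46 Mbit/s, four times as much, which agrees with the quoted CIF figure of 36.45 Mbit/sec up to rounding.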
Region Coding
Region Coding has received attention because of the ease with which it can be
decoded and the fact that a coder of this type is used in Intel's Digital
Video Interactive system (DVI), the only commercially available system designed
expressly for low-cost, low-bandwidth multimedia video. Its operation is
relatively simple. The basic design is due to Kunt.
Envision a decoder that can reproduce certain image primitives well. A typical set might consist of rectangular areas of constant color, smooth shaded patches and some textures. The image is analyzed into regions that can be expressed in terms of these primitives. The analysis is usually performed using a tree-structured decomposition where each part of the image is successively divided into smaller regions until a patch that meets either the bandwidth constraints or the quality desired can be fitted. Only the tree description and the parameters for each leaf need then be transmitted. Since the decoder is optimized for the reconstruction of these primitives, it is relatively simple to build.
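The tree-structured decomposition can be sketched with a quadtree that uses a single primitive, a square of constant value. This is a simplified illustration, not the DVI algorithm: it assumes a square greyscale image whose side is a power of two, and the function name, the `tolerance` parameter and the leaf format are made up for this example.

```python
def quadtree(image, x=0, y=0, size=None, tolerance=8):
    """Split a square image into constant-valued leaves.

    `image` is a list of rows of grey levels. Returns a list of
    (x, y, size, mean_value) leaves that together cover the region.
    """
    if size is None:
        size = len(image)
    block = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]

    # Accept the patch when a constant fits well enough, or at a single pixel.
    if size == 1 or max(block) - min(block) <= tolerance:
        return [(x, y, size, sum(block) / len(block))]

    # Otherwise divide the region into four quadrants and recurse.
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree(image, x + dx, y + dy, half, tolerance)
    return leaves

# A 4x4 image that is flat except for a bright top-left quadrant.
image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
leaves = quadtree(image)
```

Raising `tolerance` yields fewer, larger leaves (lower bandwidth, lower quality), which mirrors the trade-off between bandwidth constraints and desired quality mentioned above.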
To account for image data that does not encode easily using the available primitives, actual image data can also be encoded and transmitted, but this is not as efficient as fitting a patch.
This coder can also be combined with prediction (as it is in DVI), and the predicted difference image can then be region coded. A key element in the encoding operation is a region growing step where adjacent image patches that are distinct leaves of the tree are combined into a single patch. This approach has been considered highly asymmetric in that significantly more processing is required for encoding/analysis than for decoding. It is harder to grow a tree than to climb one.
While hardware implementations of the hybrid
DCT coder have been built for extremely low bandwidth teleconferencing
and for HDTV, there is no hardware for a region coder. However, such an
assessment is deceptive since much of the processing used in DVI compression
is in the motion predictor, a function common to both methods. In fact, all
compression schemes are asymmetric; the difference is a matter of degree
rather than one of essentials.
Repeater
Repeaters are transparent devices used to interconnect segments of an
extended network with identical protocols and speeds at the physical layer
(OSI layer 1). An example of a repeater connection would be the linkage
of two carrier sense multiple access/collision detection (CSMA/CD) segments
within a network.
Router
Routers connect networks at OSI layer 3. Routers interpret packet contents
according to specified protocol sets, serving to connect networks with the
same protocols (DECnet to DECnet, TCP/IP (Transmission Control
Protocol/Internet Protocol) to TCP/IP). Routers are
protocol-dependent; therefore, one router is needed for each protocol used by
the network. Routers are also responsible for the determination of the best
path for data packets by routing them around failed segments of the network.
SECAM (Séquentiel Couleur à Mémoire)
European video standard with image format 4:3, 625 lines, 50 Hz and
6 MHz video bandwidth with a total 8 MHz of video channel width.
SMPTE
SMPTE is the Society of Motion Picture and Television Engineers.
There is an SMPTE time code standard (hr:min:sec:frame) used to identify
video frames.
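As an illustration of the hr:min:sec:frame layout, the sketch below converts a running frame count into a time code string. It assumes a whole number of frames per second and ignores the drop-frame variant used with 29.97 Hz video; the function name is hypothetical.

```python
def frames_to_timecode(frame_count, fps=30):
    """Format a frame count as hr:min:sec:frame (non-drop-frame, integer fps assumed)."""
    seconds, frames = divmod(frame_count, fps)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}:{frames:02d}"

print(frames_to_timecode(109831))
```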
SNA
Systems Network Architecture (SNA) entered the market in 1974 as a hierarchical,
single-host network structure. Since then, SNA has developed steadily
in two directions. The first direction involved tying together mainframes
and unintelligent terminals in a master-to-slave relationship. The
second direction transformed the SNA architecture to support a
cooperative-processing environment, whereby remote terminals link up with
mainframes as well as each other in a peer-to-peer relationship (termed Low
Entry Networking (LEN) by IBM). LEN depends on the implementation of
two protocols: Logical Unit 6.2, also known as APPC, and Physical Unit 2.1
which affords point-to-point connectivity between peer nodes without
requiring host computer control.
The SNA model is concerned with both logical and physical units. Logical
units (LUs) serve as points of access by which users can utilize the network.
LUs can be viewed as terminals that provide users access to application
programs and other services on the network. Physical units (PUs), like
LUs, are not defined within SNA architecture but are instead representations
of the devices and communication links of the network.
Standard bodies
Many countries have a national standards body where experts from
industry and universities develop standards for all kinds of
engineering problems. Among them are, for instance,
The International Organization for Standardization, ISO, in Geneva is the head organization of all these national standardization bodies. Together with the International Electrotechnical Commission, IEC, ISO concentrates its efforts on harmonizing national standards all over the world. The results of these activities are published as ISO standards. Among them are, for instance, the metric system of units, international stationery sizes, all kinds of nuts and bolts, rules for technical drawings, electrical connectors, security regulations, computer protocols, file formats, bicycle components, ID cards, programming languages, International Standard Book Numbers (ISBN), ... Over 10,000 ISO standards have been published so far, and you surely come into contact every day with many things that conform to ISO standards you have never heard of. By the way, "ISO" is not an acronym for the organization in any language. It's a wordplay based on the English/French initials and the Greek-derived prefix "iso-" meaning "same".
Within ISO, ISO/IEC Joint Technical Committee 1 (JTC1) deals with information technology.
The International Telecommunication Union, ITU, is the United Nations specialized agency dealing with telecommunications. At present there are 164 member countries. One of its bodies is the International Telegraph and Telephone Consultative Committee, CCITT. A Plenary Assembly of the CCITT, which takes place every few years, draws up a list of 'Questions' about possible improvements in international electronic communication. In Study Groups, experts from different countries develop 'Recommendations' which are published after they have been adopted. Especially relevant to computing are the V series of recommendations on modems (e.g. V.32, V.42), the X series on data networks and OSI (e.g. X.25, X.400), the I and Q series that define ISDN, the Z series that defines specification and programming languages (SDL, CHILL), the T series on text communication (teletext, fax, videotext, ODA) and the H series on digital sound and video encoding.
Since 1961, the European Computer Manufacturers Association, ECMA, has
been a forum for data processing experts where agreements have been
prepared and submitted for standardization to ISO, CCITT and other
standards organizations.
Sub band coding
Sub-band coding for images has roots in work done in the 1950s by Bedford
and on Mixed Highs image compression done by Kretzmer in 1954.
Schreiber and Buckley explored general two channel coding of
still pictures where the low
spatial frequency channel was coarsely sampled and finely quantized and the
high spatial frequency channel was finely sampled and coarsely quantized.
More recently, Karlsson and Vetterli have extended this to
multiple subbands.
Adelson et al. have shown how a recursive subdivision called a pyramid
decomposition can be used both for compression and other useful image
processing tasks.
A pure sub-band coder performs a set of filtering operations on an image to divide it into spectral components. Usually, the result of the analysis phase is a set of sub-images, each of which represents some region in spatial or spatio-temporal frequency space. For example, in a still image, there might be a small sub-image that represents the low-frequency components of the input picture and that is directly viewable as either a minified or blurred copy of the original. To this are added successively higher spectral bands that contain the edge information necessary to reproduce the sharpness of the original at successively larger scales. As with the DCT coder, to which it is related, much of the image energy is concentrated in the lowest frequency band.
For equal visual quality, each band need not be represented with the same
signal-to-noise ratio; this is the basis for sub-band coder compression. In
many coders, some bands are eliminated entirely, and others are often
compressed with a vector or lattice quantizer. Successively higher frequency
bands are more coarsely quantized, analogous to the truncation of the high
frequency coefficients of the DCT. A sub-band decomposition can be the
intraframe coder in a predictive loop, thus minimizing the basic distinctions
between DCT-based hybrid coders and their alternatives.
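The analysis, per-band quantization and synthesis steps can be sketched with the simplest possible filter pair. The example below is an assumption-laden illustration, not any particular published coder: it uses a one-dimensional two-band average/difference (Haar-style) split on an even-length signal, and a plain uniform quantizer in place of a vector or lattice quantizer.

```python
def analyze(signal):
    """Split an even-length signal into half-rate low (average) and high (difference) bands."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low, high

def synthesize(low, high):
    """Recombine the two bands; exact when the bands are unquantized."""
    out = []
    for l, h in zip(low, high):
        out += [l + h, l - h]
    return out

def quantize(band, step):
    """Uniform quantizer; a larger step means coarser representation of that band."""
    return [step * round(v / step) for v in band]

signal = [10, 12, 20, 20, 7, 3, 0, 8]
low, high = analyze(signal)
# Quantize the low band finely and the high band coarsely.
approx = synthesize(quantize(low, 1), quantize(high, 4))
```

Applying `analyze` again to the low band gives the recursive, pyramid-like subdivision mentioned earlier in this entry.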
T1Q1.5
The T1Q1.5 Video Teleconferencing/Video Telephony (VTC/VT) ANSI Subworking
Group (SWG) was formed to draft a performance standard for digital video.
Important questions were asked relating to the digital video performance
characteristics of video teleconferencing/video telephony :
A trellis is a transition diagram for a finite state machine that takes time into account. Populating a trellis means specifying output symbols for each branch; specifying an initial state then yields a set of allowable output sequences.
A trellis coder is defined as follows: given a trellis populated with
symbols from an output alphabet and an input sequence x of length n, a
trellis coder outputs the sequence of bits that selects, among the allowable
output sequences, the approximation of x that maximizes the SNR of the encoding.
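A minimal sketch of such a coder follows. It assumes a binary trellis stored as `trellis[state][bit] = (next_state, output_symbol)`, and it maximizes SNR by minimizing the total squared error with a Viterbi-style search that keeps only the cheapest path into each state. The two-state trellis and all names are made up for illustration.

```python
def trellis_encode(x, trellis, start=0):
    """Return (bits, squared_error) for the allowable output sequence closest to x."""
    best = {start: (0.0, [])}            # state -> (cost so far, bits so far)
    for sample in x:
        nxt = {}
        for state, (cost, bits) in best.items():
            for bit, (to, symbol) in enumerate(trellis[state]):
                c = cost + (sample - symbol) ** 2
                if to not in nxt or c < nxt[to][0]:
                    nxt[to] = (c, bits + [bit])   # keep the survivor path only
        best = nxt
    cost, bits = min(best.values(), key=lambda entry: entry[0])
    return bits, cost

def trellis_decode(bits, trellis, start=0):
    """Walk the trellis with the received bits and emit the branch symbols."""
    state, out = start, []
    for bit in bits:
        state, symbol = trellis[state][bit]
        out.append(symbol)
    return out

# Hypothetical two-state trellis: state 1 offers a wider pair of output symbols.
TRELLIS = [[(0, -1.0), (1, 1.0)],
           [(0, -3.0), (1, 3.0)]]
bits, err = trellis_encode([0.9, 2.8, -1.2], TRELLIS)
```

One bit is spent per input sample, yet the decoder can produce four distinct symbol values because the available symbols depend on the state reached so far.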
X.25
A standard networking protocol suite approved by the CCITT and
ISO. This protocol suite defines standard physical, link, and
networking layers (OSI layers 1 through 3). X.25 networks are in
use throughout the world.
X.400
The set of CCITT communications standards covering mail
services provided by data networks.
YCC (Kodak PhotoCD [tm])
Kodak's PhotoYCC colour space (for PhotoCD) is similar to YCbCr, except
that Y is coded with lots of headroom and no footroom, and the scaling of
Cb and Cr is different from that of Rec. 601-1 in order to accommodate a
wider colour gamut:
For Rec. 601-1 coding in eight bits per component,
CCIR Rec. 601-1 calls for two-to-one horizontal subsampling of Cb and Cr, to achieve 2/3 the data rate of RGB with virtually no perceptible penalty. This is denoted 4:2:2. A few digital video systems have utilized horizontal subsampling by a factor of four, denoted 4:1:1. JPEG and MPEG normally subsample Cb and Cr two-to-one horizontally and also two-to-one vertically, to get 1/2 the data rate of RGB. No standard nomenclature has been adopted to describe vertical subsampling. To get good results using subsampling you should not just drop and replicate pixels, but implement proper decimation and interpolation filters.
YCbCr coding is employed by D-1 component digital video equipment.
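The 2/3 and 1/2 data-rate figures quoted for chroma subsampling follow from counting samples. The short sketch below assumes equal bit depth in all three channels; the function name is illustrative.

```python
from fractions import Fraction

def relative_rate(h_factor, v_factor=1):
    """Data rate relative to RGB when Cb and Cr are each subsampled by the given factors."""
    chroma = Fraction(1, h_factor * v_factor)   # one chroma channel, relative to luma
    return (1 + 2 * chroma) / 3                 # Y + Cb + Cr over three full channels

print(relative_rate(2))      # 4:2:2
print(relative_rate(4))      # 4:1:1
print(relative_rate(2, 2))   # two-to-one horizontally and vertically
```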
YPbPr
If three components are to be conveyed in three separate channels with
identical unity excursions, then the Pb and Pr colour difference
components are used:
YPbPr is part of the CCIR Rec. 709 HDTV standard, although different luma coefficients are used, and it is denoted E'Pb and E'Pr with subscript arrangement too complicated to be written here.
YPbPr is employed by component analog video equipment such as M-II and
BetaCam; Pb and Pr bandwidth is half that of luma.
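One way to see where the Pb and Pr scaling comes from is to normalize (B-Y) and (R-Y) so that each spans -0.5 to +0.5, matching the unity excursion of Y. The sketch below assumes the Rec. 601 luma coefficients (0.299, 0.587, 0.114), which give factors of roughly 0.564 and 0.713; Rec. 709 coefficients would give different values.

```python
# Assumed Rec. 601 luma coefficients; not the Rec. 709 HDTV values.
KR, KG, KB = 0.299, 0.587, 0.114

def rgb_to_ypbpr(r, g, b):
    """Map gamma-corrected R'G'B' in [0, 1] to Y in [0, 1] and Pb, Pr in [-0.5, +0.5]."""
    y = KR * r + KG * g + KB * b
    pb = 0.5 * (b - y) / (1 - KB)   # B-Y spans +/-(1 - KB); rescale to +/-0.5
    pr = 0.5 * (r - y) / (1 - KR)   # R-Y spans +/-(1 - KR); rescale to +/-0.5
    return y, pb, pr

print(rgb_to_ypbpr(0.0, 0.0, 1.0))   # pure blue drives Pb to its positive limit
```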
YIQ
The U and V signals above must be carried with equal bandwidth, albeit
less than that of luma. However, the human visual system has less spatial
acuity for magenta-green transitions than it does for red-cyan. Thus, if
signals I and Q are formed from a 123 degree rotation of U and V
respectively [sic], the Q signal can be more severely filtered than I (to
about 600 kHz, compared to about 1.3 MHz) without being perceptible
to a viewer at typical TV viewing distance. YIQ is equivalent to YUV with
a 33 degree rotation and an axis flip in the UV plane. The first
edition of W.K. Pratt "Digital Image Processing", and presumably other
authors that follow that bible, has a matrix that erroneously omits the
axis flip; the second edition corrects the error.
Since an analog NTSC decoder has no way of knowing whether the encoder was encoding YUV or YIQ, it cannot detect whether the encoder was running at 0 degree or 33 degree phase. In analog usage the terms YUV and YIQ are often used somewhat interchangeably. YIQ was important in the early days of NTSC but most broadcasting equipment now encodes equiband U and V.
The D-2 composite digital DVTR (and the associated interface standard)
conveys NTSC modulated on the YIQ axes in the 525-line version and
PAL modulated on the YUV axes in the 625-line version.
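The "33 degree rotation and an axis flip" relating YIQ to YUV can be written out directly. The sketch below uses the commonly quoted form I = V cos 33° - U sin 33°, Q = V sin 33° + U cos 33°; the function name is illustrative. Because the transform combines a rotation with a reflection, applying it twice returns the original U and V, so the same function also converts I and Q back.

```python
import math

def uv_to_iq(u, v, degrees=33.0):
    """Map a (U, V) chroma pair to (I, Q): rotate by `degrees` and flip one axis."""
    s = math.sin(math.radians(degrees))
    c = math.cos(math.radians(degrees))
    i = -u * s + v * c
    q = u * c + v * s
    return i, q

i, q = uv_to_iq(0.0, 1.0)          # a pure V signal
u_back, v_back = uv_to_iq(i, q)    # the transform is its own inverse
```

A matrix that omits the axis flip would be a pure rotation, which is not its own inverse; that gives a quick check for the error mentioned above.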
YUV
In composite NTSC, PAL or S-video systems, it is necessary to scale (B-Y)
and (R-Y) so that the composite NTSC or PAL signal (luma plus modulated
chroma) is contained within the range -1/3 to +4/3. These limits reflect
the capability of composite signal recording or transmission channel. The
scale factors are obtained by two simultaneous equations involving both
B-Y and R-Y, because the limits of the composite excursion are reached at
combinations of B-Y and R-Y that are intermediate to primary colours. The
scale factors are as follows:
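The two simultaneous equations can be solved numerically as a check. The sketch below makes several assumptions not spelled out in the text: Rec. 601 luma coefficients, a composite peak of Y + C with C the chroma magnitude sqrt(U^2 + V^2), and yellow and cyan as the colours that reach the +4/3 limit. Under those assumptions it arrives at the commonly quoted factors of about 0.492 for (B-Y) and 0.877 for (R-Y).

```python
KR, KG, KB = 0.299, 0.587, 0.114   # assumed Rec. 601 luma coefficients

def chroma_scale_factors(peak=4 / 3):
    """Solve for (ku, kv) so that Y + C just reaches `peak` for yellow and for cyan."""
    rows = []
    for r, g, b in [(1, 1, 0), (0, 1, 1)]:          # yellow, cyan (assumed limiting colours)
        y = KR * r + KG * g + KB * b
        # C^2 = ku^2 (B-Y)^2 + kv^2 (R-Y)^2 must equal (peak - Y)^2
        rows.append(((b - y) ** 2, (r - y) ** 2, (peak - y) ** 2))
    (a1, b1, c1), (a2, b2, c2) = rows
    det = a1 * b2 - a2 * b1                          # Cramer's rule, unknowns ku^2 and kv^2
    ku2 = (c1 * b2 - c2 * b1) / det
    kv2 = (a1 * c2 - a2 * c1) / det
    return ku2 ** 0.5, kv2 ** 0.5

ku, kv = chroma_scale_factors()
print(f"U = {ku:.6f} (B-Y), V = {kv:.6f} (R-Y)")
```

With the same factors, blue and red (the complements of yellow and cyan) reach the lower limit of -1/3.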
It is conventional for an NTSC luma signal in a composite environment (NTSC or S-video) to have 7.5% setup :
The two signals Y (or Y_setup) and C can be conveyed separately across an S-video interface, or Y and C can be combined (encoded) into composite NTSC or PAL:
The following is a list of persons whose material contributed to the creation of this list :