Sunday, August 25, 2013

Medical Image Watermarking Engineering Project



In recent times the phenomenal growth of the internet has drawn attention to the need to ensure protection and control of exchanged data. Because of their digital nature, multimedia documents can be duplicated, modified, transformed, and misused very easily. Exact copies of digital information, be it images, text or audio, can be produced and distributed easily. Digital watermarking is a technique that addresses the longstanding problem of protecting copyright in digital data. The aim of watermarking is to include imperceptible (subliminal) information in a multimedia document to provide a security service or simply a labeling application. Digital watermarks are pieces of information added to digital data (audio, video, or still images) that can be detected or extracted later to make an assertion about the data. This information can be textual data about the author, the copyright, etc., or it can itself be an image. The information to be hidden is embedded by manipulating the contents of the digital data, allowing someone to identify the original owner or, in the case of illicit duplication of purchased material, the buyer involved.

A robust digital watermark is designed to remain intact under transmission and transformation, allowing us to protect our ownership rights in digital form. Recovering the embedded message should therefore be possible even if the document has been altered by one or more nondestructive attacks, whether malicious or not. In practice, a watermarked object may be altered either on purpose or accidentally, so the watermarking system should still be able to detect and extract the watermark. Obviously, the distortions are limited to those that do not produce excessive degradation, since otherwise the transformed object would be unusable.

The attacks could be:
Additive Noise - Introduced by D/A and A/D converters or by transmission errors
Filtering - Low-pass filtering, for example, degrades the image only slightly but can strongly affect watermark detection performance
Cropping - The attacker is interested in only a small portion of the watermarked object
Compression - A usually unintentional attack that appears often in multimedia applications during distribution via the internet
Rotation and Scaling - Correlation-based detection and extraction fail when rotation or scaling is performed on the watermarked image
Statistical Averaging - An attacker may try to estimate the watermark and then ‘un-watermark’ the object by subtracting the estimate
Multiple Watermarking - An attacker may watermark an already watermarked object and later make claims of ownership

There are some desirable characteristics that a watermark should possess:
Imperceptibility
An unmarked image is passed through a perceptual analysis block that determines how much each pixel can be altered such that the resulting watermarked image is indistinguishable from the original. This takes into account the human eye's sensitivity to changes in flat areas and its relatively high tolerance of small changes at edges. If the watermarked image and the original image are perceptually indistinguishable, the watermark is called imperceptible. A watermark is called perceptible if its presence in the marked signal is noticeable, as in the case of visible watermarking.
Robustness
The ability of a watermark to withstand modifications (compression, rotation, noise) is called its robustness. The watermark should be resilient to standard manipulations, whether unintentional or intentional. It should be statistically irremovable and should withstand multiple watermarking to facilitate traitor tracing.
Capacity
The number of bits that can be embedded into a particular cover image with low error visibility is called the capacity of the watermark. Watermarking capacity is determined by the invisibility and robustness requirements.

Classification of Watermarks
It is not possible to have a universal watermarking algorithm that caters to the needs of all applications, so watermarks are classified by their properties according to the requirements of the application. Watermarks may be visible, in which case their use is two-fold: they discourage unauthorized usage and also act as an advertisement. However, the focus here is on invisible watermarks, as they do not cause any degradation in the aesthetic quality or the usefulness of the data. They can be detected and extracted later to support a claim of ownership, and can yield other relevant information as well. Watermarks can also be classified by their level of robustness to image changes and alterations. They can be divided into 3 main categories: Fragile, Semi-fragile and Robust. Fragile watermarks are designed to detect even the slightest modification made to an image. Semi-fragile watermarks are designed to withstand certain legitimate modifications but to detect malicious ones. If the image is expected to undergo severe modification and degradation, including analog-to-digital and digital-to-analog conversion, cropping, scaling, etc., then a Robust watermark is used.

Structure of a typical watermarking system
There are 3 main processes involved in watermarking:
Insertion of a watermark
Detection of a watermark
Removal of a watermark
Extracting the watermark can be divided into two phases:
Locating the watermark
Recovering the watermark information
A watermark detection unit consists of an extraction unit that first extracts the watermark and then compares it with the original watermark that was inserted. The output is ‘Yes’ or ‘No’ depending on whether the watermark is present. Image watermarking techniques differ in the domain in which the watermarking is done: the spatial domain or the frequency domain. Watermarking in the spatial domain involves selecting the pixels to be modified based on their location within the image, and is very susceptible to cropping and the mosaic attack. Watermarking in the frequency domain involves modifying selected coefficients of a transform of the image (such as the DCT) rather than individual pixels, so that the watermark is spread over many pixels. This is intended to overcome the greatest disadvantage of techniques operating in the spatial domain, i.e. susceptibility to cropping.


Least Significant Bit Substitution [Spatial Domain][Fragile]
The most straightforward method of watermark embedding is to embed the watermark into the least significant bits of the image pixels. In this method, a small watermark may be embedded multiple times, so that even if most copies are lost due to attacks, a single surviving copy would be considered a success. The watermark may survive transformations such as cropping, but any addition of noise or lossy compression is likely to destroy it. An improvement on basic LSB substitution is to use a pseudo-random number generator to determine the pixels to be used for embedding, based on a given “seed” or key. A further variant borrows from CDMA spread-spectrum techniques: each watermark bit is given its own key and pseudo-random noise (PN) sequence, which is spread over the image. To detect the watermark, each key is used to generate its PN sequence, which is then correlated with the entire image. If the correlation is high, that bit in the watermark is set to “1”, otherwise to “0”. The process is then repeated for all the bits of the watermark. CDMA improves the robustness of the watermark significantly, but requires several orders of magnitude more computation. It is generally preferable to hide watermarking information in noisy regions and edges of images, rather than in smoother regions. The benefit is two-fold: degradation in smoother regions of an image is more noticeable to the HVS (human visual system), and such regions become a prime target for lossy compression schemes. However, noisy regions and edges are not easily identified when working directly in the spatial domain.
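As a rough illustration of keyed LSB substitution, here is a minimal Python sketch (the project itself was written in MATLAB). It treats the image as a flat list of 8-bit grey levels; the function names, the toy cover image and the key value are assumptions made for this example only, not part of the project code.

```python
import random

def embed_lsb(pixels, bits, key):
    """Return a copy of `pixels` with `bits` written into the LSBs of
    pixel positions chosen by a PRNG seeded with `key`."""
    if len(bits) > len(pixels):
        raise ValueError("watermark is longer than the cover image")
    positions = random.Random(key).sample(range(len(pixels)), len(bits))
    marked = list(pixels)
    for pos, bit in zip(positions, bits):
        marked[pos] = (marked[pos] & ~1) | bit  # clear the LSB, then set it
    return marked

def extract_lsb(pixels, n_bits, key):
    """Regenerate the same positions from `key` and read the LSBs back."""
    positions = random.Random(key).sample(range(len(pixels)), n_bits)
    return [pixels[pos] & 1 for pos in positions]

# Toy 64-pixel "image" and an 8-bit watermark (illustrative values only).
cover = [(37 * i + 11) % 256 for i in range(64)]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(cover, watermark, key=2013)
recovered = extract_lsb(marked, len(watermark), key=2013)

# Flipping every LSB (a crude stand-in for noise) inverts the watermark,
# which is why this scheme is classed as fragile.
noisy = [p ^ 1 for p in marked]
damaged = extract_lsb(noisy, len(watermark), key=2013)
```

Each pixel changes by at most one grey level, so the mark is invisible, but the last two lines show how little disturbance is needed to destroy it.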
Threshold-Based Correlation in DCT mid-band [Frequency Domain][Robust]
The DCT allows an image to be broken up into different frequency bands, making it much easier to embed watermarking information into the middle frequency bands of an image. The middle frequency bands are chosen because they avoid the most visually important parts of the image (the low frequencies) without over-exposing the watermark to removal through compression and noise attacks (as would happen in the high frequency band). FL is used to denote the lowest frequency components of the block, while FH is used to denote the higher frequency components. FM, the middle band, is chosen as the embedding region so as to provide additional resistance to lossy compression techniques while avoiding significant modification of the cover image. For each 8x8 block (x, y) of the image, the DCT of the block is first calculated. In that block, the PN sequence W, multiplied by a gain factor k, is added to the middle frequency components FM. Coefficients in the low and high frequencies are copied over to the transformed image unaffected. Each block is then inverse-transformed to give the final watermarked image IW. For detection, the image is broken up into the same 8x8 blocks, and a DCT is performed on each. The same PN sequence is then compared to the middle frequency values of the transformed block. If the correlation between the sequences exceeds some threshold T, a “1” is detected for that block; otherwise a “0” is detected. Here k denotes the strength of the watermarking: increasing k increases the robustness of the watermark at the expense of quality.
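The embedding and detection steps for a single 8x8 block can be sketched in plain Python as follows. This is only a sketch under stated assumptions: the mid-band FM is taken to be the coefficients with 3 <= u + v <= 5, a block carrying a “0” is left unmodified, and the gain k = 10 and threshold T = 5 are illustrative values. None of these choices comes from the project code.

```python
import math
import random

N = 8
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
       for u in range(N)]

def scale(u):
    """Normalisation factor of the orthonormal DCT-II."""
    return math.sqrt((1.0 if u == 0 else 2.0) / N)

def dct2(block):
    """2-D DCT of an 8x8 block given as a list of rows."""
    return [[scale(u) * scale(v) * sum(block[x][y] * COS[u][x] * COS[v][y]
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse of dct2."""
    return [[sum(scale(u) * scale(v) * coeffs[u][v] * COS[u][x] * COS[v][y]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

# Assumed mid-band FM: three anti-diagonals between low and high frequencies.
MID_BAND = [(u, v) for u in range(N) for v in range(N) if 3 <= u + v <= 5]

def pn_sequence(key):
    """Keyed +/-1 PN sequence W, one value per mid-band coefficient."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in MID_BAND]

def embed_bit(block, bit, key, k):
    """Add k*W to FM when bit is 1; FL and FH are always left untouched."""
    coeffs = dct2(block)
    if bit:
        for (u, v), w in zip(MID_BAND, pn_sequence(key)):
            coeffs[u][v] += k * w
    return [[min(255, max(0, round(p))) for p in row] for row in idct2(coeffs)]

def detect_bit(block, key, threshold):
    """Correlate FM with W and compare the result with the threshold T."""
    coeffs = dct2(block)
    pn = pn_sequence(key)
    corr = sum(coeffs[u][v] * w for (u, v), w in zip(MID_BAND, pn)) / len(pn)
    return 1 if corr > threshold else 0

# A smooth toy block: its own mid-band energy is small, so T = k/2 works.
cover = [[100 + 2 * x + 3 * y for y in range(N)] for x in range(N)]
one = detect_bit(embed_bit(cover, 1, key=7, k=10.0), key=7, threshold=5.0)
zero = detect_bit(embed_bit(cover, 0, key=7, k=10.0), key=7, threshold=5.0)
```

On a textured block the cover's own mid-band coefficients contribute more to the correlation, so in practice k and T have to be tuned together.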

MEDICAL IMAGE WATERMARKING

Hiding patient data in the medical image is one of the applications of digital image watermarking. Patient data in electronic format is called an Electronic Patient Record (EPR). Medical images with the EPR attached to them can be sent to clinicians anywhere in the world for diagnosis. Thus Medical Image Watermarking plays a vital role in the field of Telemedicine.
Attacks on Medical Images
All patient records, electronic or not, are covered by medical secrecy and must be kept confidential. Because of the sensitive nature of the data, the first and foremost requirement is that any additional information embedded in the medical image must not affect its perceptual quality. Medical image watermarking is done for mainly two reasons: to increase security and to verify the integrity of medical images.
The attacks on medical images can be broadly classified into 4 main categories:
Interruption: An attack on availability. Information is destroyed or becomes unavailable or unusable.
Interception: An attack on confidentiality. An unauthorized party gains access to information.
Modification: An attack on integrity. An unauthorized party not only gains access to, but also tampers with information.
Fabrication: An attack on authenticity. An unauthorized party inserts counterfeit objects into the system.
To guard against the above-mentioned attacks during transmission, medical images are watermarked using certain algorithms.
Need For Compression
Medical images are acquired and stored digitally, especially the grayscale diagnostic imagery used in radiology. These images are typically large in size and also large in number. Efficient compression makes it possible to increase the speed of transmission and reduce the cost of storage. Long-term storage and mobile transmission of large images is prohibitive if no compression is used. A typical mammogram may be digitized at 2048 x 2048 pixels at 16 bpp (bits per pixel), leading to a file of over 8 megabytes (8,388,608 bytes) if no compression is used. For cost-effective wireless transmission, compression must be used to discard some of the redundant image data to meet the mobile bandwidth constraint. This typically involves the use of the widely accepted Joint Photographic Experts Group (JPEG) standards. The most commonly used of these is lossy baseline JPEG. Images with slowly varying scene content and high correlation can be compressed efficiently, as the image information can be concentrated into a few coefficients in the frequency or transform domain. The images used here, however, contain high-contrast edges and high levels of detail, so more information must be retained in order to reconstruct important picture information effectively. Although quality is good most of the time, lossy compression can introduce false information or artifacts such as ringing and blurring, which become apparent at very low bit rates.
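The storage figure quoted for the mammogram can be checked with a few lines of Python. The 10:1 compression ratio at the end is purely an illustrative assumption to show the scale of the saving; it is not a figure from this project.

```python
def raw_size_bytes(width, height, bits_per_pixel):
    """Size of an uncompressed image in bytes."""
    return width * height * bits_per_pixel // 8

size = raw_size_bytes(2048, 2048, 16)
print(size)            # 8388608 bytes
print(size / 2 ** 20)  # 8.0 (binary megabytes)

# Hypothetical 10:1 lossy compression, for illustration only.
assumed_ratio = 10
compressed = size / assumed_ratio
```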

A watermarking technique that withstands acceptable levels of JPEG compression, for ease of transmission, is needed. Also, to ensure the diagnostic integrity of the clinically crucial regions of an image, a multiple watermarking technique could be used that verifies the integrity of the region of interest (ROI) prior to diagnosis. Such a technique should be designed for robustness to acceptable levels of baseline JPEG compression so that it is compatible with most digital imaging systems, which already employ the standard in their hardware and software infrastructures.
Region Based Compression
Region of Interest (ROI) based compression schemes identify the regions of an image that are determined by some criterion to be of highest clinical importance. The ROI is typically compressed using a lossless or near-lossless technique, while the Region of Background (ROB) can be compressed with greater loss than the ROI. Care must be taken when performing the segmentation and compression of medical imagery, because the diagnostically important regions must be preserved at high quality, while the rest of the image is important in a contextual sense and helps the viewer to see the position of the ROI within the original image.
Critical feature information is extracted from the ROI and can be used as a signature. To avoid perceptual degradation of the crucial diagnostic region, robust watermarking needs to be used in which the watermark is embedded around the ROI, in the Region of Background (ROB), to provide authentication of these types of images. A simple method for multiple watermarking involves embedding the same authentication information in the eight ROB regions surrounding the ROI, or in fewer regions if space in the ROB is unavailable. Similar watermarks could be used to occupy a smaller image area, which would require the capacity of the watermarking system to be increased. Multiple embedding can provide additional robustness if the image is cropped and some of the surrounding watermarks are lost. The image is watermarked robustly to allow for acceptable distortions, including conversion to and from spatial form as well as complete lossy JPEG encoding of the entire image to an acceptable bit rate. These include the distortions of integer rounding and DCT quantization. This type of authentication technique could be extended to any image with a critically important region that requires authentication.
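The choice of up to eight surrounding regions can be sketched as below. The sketch assumes that each ROB region is a tile of the same size as the ROI placed directly next to it, and that a tile is simply skipped when it does not fit inside the image; the function name and the (top, left, height, width) tuple layout are made up for this example.

```python
def surrounding_regions(roi, image_size):
    """Return the ROB tiles around the ROI that fit inside the image.

    roi        -- (top, left, height, width) of the region of interest
    image_size -- (rows, cols) of the whole image
    """
    top, left, h, w = roi
    rows, cols = image_size
    regions = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # the ROI itself is never watermarked
            r, c = top + dr * h, left + dc * w
            if r >= 0 and c >= 0 and r + h <= rows and c + w <= cols:
                regions.append((r, c, h, w))
    return regions

# ROI in the middle of a 256x256 image: all eight neighbours are available.
centre = surrounding_regions((64, 64, 64, 64), (256, 256))

# ROI in the top-left corner: only three neighbours fit.
corner = surrounding_regions((0, 0, 64, 64), (256, 256))
```

Cropping one side of the image removes at most three of the eight copies, which is where the extra robustness of multiple embedding comes from.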
Extracting the Signature
A ROI is specified at the location where the critical image information is segmented from the ROB. A signature is extracted from the low frequency DCT coefficients of the micro blocks in the ROI and embedded into higher frequency terms of the ROB as a semi-fragile watermark. The signature is based on relationships between pairs of DCT coefficients, taken from randomly selected pairs of blocks, that are invariant to JPEG compression. For each pair of DCT blocks, 8 corresponding low frequency coefficients are compared to obtain an 8-bit binary feature code sequence Z. Consider two blocks Ca and Cb that have been paired: each signature bit b in Z is determined by the relationship between Ca(i, j) and Cb(i, j), where i and j are the coordinates of a low frequency coefficient. Because one bit is generated from every two DCT coefficients that are compared, 8 signature bits are generated from every micro block pair in the ROI. These are embedded multiple times into the ROB, in regions of the same shape as the ROI but in multiple locations.
Function which extracts signature from the ROI
Two randomly selected micro blocks are taken from the ROI and the signature coefficients are determined. These are the first 8 coefficients in JPEG zig-zag scan order. Corresponding coefficients of the two blocks are compared with each other to generate feature codes. This process is repeated until there are no block pairs left.
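A minimal Python sketch of this signature extraction is given below. The comparison rule used here (bit = 1 when Ca(i, j) >= Cb(i, j), otherwise 0) is an assumption, as are the function names, the key and the toy blocks; the actual project functions are the MATLAB files listed later in this post.

```python
import math
import random

N = 8
COS = [[math.cos((2 * x + 1) * u * math.pi / (2 * N)) for x in range(N)]
       for u in range(N)]

def scale(u):
    """Normalisation factor of the orthonormal DCT-II."""
    return math.sqrt((1.0 if u == 0 else 2.0) / N)

def dct2(block):
    """2-D DCT of an 8x8 block given as a list of rows."""
    return [[scale(u) * scale(v) * sum(block[x][y] * COS[u][x] * COS[v][y]
                                       for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def zigzag(n=N):
    """(row, col) positions of an n x n block in JPEG zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # odd diagonals run from top-right to bottom-left, even ones back up
        order.extend(diag if s % 2 else reversed(diag))
    return order

def compare_pair(ca, cb, n_coeffs=8):
    """Assumed rule: one bit per coefficient pair, 1 if Ca(i,j) >= Cb(i,j)."""
    return [1 if ca[i][j] >= cb[i][j] else 0 for i, j in zigzag()[:n_coeffs]]

def extract_signature(roi_blocks, key):
    """Pair up the ROI blocks pseudo-randomly and emit 8 bits per pair."""
    order = list(range(len(roi_blocks)))
    random.Random(key).shuffle(order)
    coeffs = [dct2(b) for b in roi_blocks]
    signature = []
    for k in range(0, len(order) - 1, 2):
        signature += compare_pair(coeffs[order[k]], coeffs[order[k + 1]])
    return signature

# Four toy 8x8 blocks standing in for a small ROI (illustrative values).
blocks = [[[(40 * b + (b + 1) * x + (2 * b + 3) * y) % 256 for y in range(N)]
           for x in range(N)] for b in range(4)]
signature = extract_signature(blocks, key=42)
print(len(signature))  # 16: two block pairs, 8 bits each
```

The same key must be used at the receiver so that the blocks are paired in the same way before the signature is recomputed and compared with the extracted watermark.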
Embedding the Watermark
The process of watermark embedding is very similar to that used by Cox et al. (2001). Four signature bits are embedded into the high frequency DCT coefficients of each micro block in the image ROB. Let b_e be the value of one of the signature bits. It is embedded by the following process:
Select 7 coefficients from the 28 high frequency coefficients. Let us call them C[0], C[1], C[2], …, C[6]. These coefficients are selected by following the JPEG zig-zag scan process depicted in the figure below.
The first coefficient C[0] is made equal to b_e, i.e. the bit to be embedded.
The resulting coefficients are Cw[0], Cw[1], …, Cw[6].
For each micro block, the last 28 DCT coefficients of the JPEG zig-zag scan are used to host four signature bits. The lowest level function embeds one bit in a selection of 7 of these coefficients. This module is also re-used by the part of the watermarking system that extracts four bits from a block, the exception being that only extraction takes place and the central flow structure containing the watermark bits is not used.
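The layout of the host coefficients can be sketched in Python as follows. The text does not say how each group of 7 is picked from the 28, so splitting them into four consecutive runs of the zig-zag scan is an assumption, and the embedding rule itself is deliberately left out. The two helper functions at the end simply restate the bit counts given above (8 signature bits per ROI block pair, 4 embedded bits per ROB block).

```python
def zigzag(n=8):
    """(row, col) positions of an n x n block in JPEG zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # odd diagonals run from top-right to bottom-left, even ones back up
        order.extend(diag if s % 2 else reversed(diag))
    return order

# The last 28 coefficients of the scan: the high frequency host positions.
HOST_POSITIONS = zigzag()[-28:]

def host_groups(bits_per_block=4, coeffs_per_bit=7):
    """Assumed grouping: four consecutive runs of 7 host coefficients."""
    return [HOST_POSITIONS[g * coeffs_per_bit:(g + 1) * coeffs_per_bit]
            for g in range(bits_per_block)]

def signature_length(roi_blocks):
    """8 signature bits are produced per pair of ROI blocks."""
    return (roi_blocks // 2) * 8

def region_capacity(rob_blocks):
    """4 bits are embedded in every ROB block."""
    return rob_blocks * 4

groups = host_groups()
# A ROB region with the same number of blocks as the ROI holds the signature.
fits = signature_length(64) <= region_capacity(64)
```

With these counts the signature of an ROI with an even number of blocks fills a ROB region of the same shape exactly, which matches the idea of embedding it in regions shaped like the ROI.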
IMPLEMENTATION IN MATLAB
Signature Extraction
Randomize_blocks.m: This function randomly selects micro block pairs from the ROI as part of the signature extraction process.
Extract_signature.m: This function extracts the signature from a pre-defined (manually) ROI by comparing DCT coefficients that are invariant to JPEG compression.
Watermark Embedding
Medical_image_embed.m: Highest level function to embed a ROI watermark in an image.
Embed_watermark_in_region.m: This function embeds a single authentication signature into one image region.
Embed_four_bits_in_a_block.m: This function takes four signature bits and embeds them into one micro block in the ROB.
Embed_one_bit_in_a_block.m: This function takes one watermark bit which is embedded into a selection of 7 DCT coefficients. This is the lowest level embedding function.
Watermark Extraction
Medical_image_extract.m: Highest level function to decode the ROI watermark by extracting the signature and watermark from the received image.
Extract_watermark_from_region.m: This function extracts a single watermark from an image region in the ROB.
Extract_four_bits_from_a_block.m: This function extracts four embedded bits from a single micro block. It re-uses the flow structure of the embedding function, with the exception that embedded bits are only extracted.
