The barcode reader project is intended to make tasks such as grocery shopping easier for those who are visually impaired. The application will allow individuals to scan an image of a unique barcode, and access information related to the item electronically. The product data will then be converted to a standard audio format, so that the user can have the information read aloud. By offering this service, we hope to make shopping a more practical activity for those who have trouble acquiring the information for themselves.
It is important to note that the purpose of this project is to process the UPC image and return relevant product information. We designed this project with the assumption that the individual has the ability to find and take a picture of a particular UPC, whether through their own power, or with the help of a pre-existing detection program.
Our group decided to code the barcode project in J2ME (Java for mobile phones) because it is one of the most widely available mobile development platforms. Below is a description of the application’s major functions, broken into stages of the normative user cycle:
- 1) A camera phone takes a picture of a UPC and sends it to the web server.
- 2) The server processes the image and returns the unique UPC number.
- 3) This number is cross-referenced with pre-existing databases and product information is retrieved.
- 4) Product information is returned as text, which is then converted to audio on the server.
- 5) The audio file is sent back down to the phone, which the user can play to hear the information.
The application is written in Java and uses an Apache Tomcat web server running on aludra. To pass data between the client and server, we opted for JavaServer Pages (JSP), which allow us to embed real Java code in HTML files.
Taking Pictures and Transferring Them to the Server
The first step in the application is to acquire the image of the barcode. For our prototype, the user interface is fairly straightforward: it lets the user take a snapshot of a product UPC. To simulate this, we feed the emulator a sample video of a barcode and then take a single-frame snapshot to upload to the server. The interface currently has two working options: “capture” and “exit”.
Once the user takes the picture, the Java application turns the image into a byte array and uploads the image data to the Tomcat web server.
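The upload step can be sketched roughly as follows. This is a minimal illustration written against Java SE's HttpURLConnection rather than the project's actual J2ME code; on a phone the equivalent connection would come from javax.microedition.io.HttpConnection. The class name SnapshotUploader, its method names, and the idea of sending the raw bytes as the POST body are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of the original project.
public class SnapshotUploader {

    /** Reads an input stream fully into a byte array. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /** POSTs the raw image bytes and returns the body of the server's reply. */
    public static byte[] upload(String serverUrl, byte[] imageBytes, String contentType)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(serverUrl).toURL().openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", contentType);
            // Declare the length up front so the body is streamed, not buffered.
            conn.setFixedLengthStreamingMode(imageBytes.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(imageBytes);
            }
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Upload failed with HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return readAll(in);
            }
        } finally {
            conn.disconnect();
        }
    }
}
```

A call might look like SnapshotUploader.upload("http://example.com/upload.jsp", imageBytes, "image/png"), where the URL and content type are placeholders rather than the project's real endpoint.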
Server Image Processing and Number Recognition
Once on the server, the image needs to be processed so that a unique UPC identifier can be determined. To assist us in our efforts, the team found several pre-existing Java libraries designed to process barcode images. After trying several options, we settled on ZXing, hosted on Google Code.
Although ZXing is fairly effective in its analysis, we ran into some trouble detecting UPC numbers in images with poor lighting. To address this issue, we built several image processing routines to sharpen images and increase contrast. Once the image is adequately formatted, our web server uses ZXing to pull out the image’s unique UPC. This number, in turn, is cross-referenced against consumer databases to obtain product information.
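To illustrate the kind of preprocessing described above, here is a minimal sketch of one common contrast technique: converting to grayscale and linearly stretching the intensity range so the darkest pixel becomes black and the brightest becomes white. The project's actual routines are not shown in this write-up, so the class name ContrastStretch and the choice of a linear stretch are assumptions, not a reconstruction of our code.

```java
import java.awt.image.BufferedImage;

// Hypothetical example of a contrast-enhancement pass; not the project's routine.
public class ContrastStretch {

    /** Converts a packed RGB pixel to a luminance value in 0..255 (BT.601 weights). */
    static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    /** Returns a grayscale copy whose darkest pixel maps to 0 and brightest to 255. */
    public static BufferedImage stretch(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        int[][] gray = new int[h][w];
        int min = 255;
        int max = 0;

        // First pass: compute luminance and track the intensity range.
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int v = luminance(src.getRGB(x, y));
                gray[y][x] = v;
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }

        // Guard against division by zero on a perfectly flat image.
        int range = Math.max(1, max - min);

        // Second pass: rescale every pixel into the full 0..255 range.
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int v = (gray[y][x] - min) * 255 / range;
                out.setRGB(x, y, (v << 16) | (v << 8) | v);
            }
        }
        return out;
    }
}
```

The stretched image could then be handed to the barcode decoder in place of the original. A linear stretch is sensitive to a single very dark or very bright pixel, so a real routine might clip a small percentage of outliers first.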
Product Information Lookup
Given a unique UPC, the application looks up product data by searching pre-existing UPC databases online. The database we use exposes an XML-RPC API that lets us pass it the user’s UPC and store the associated data in text format.
Although we found several UPC databases online, we decided to use UPCDatabase.com to generate data for our proof of concept. In future versions, multiple databases could be used to cross-reference the wealth of product data available online.
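For readers unfamiliar with XML-RPC, the lookup request is just a small XML document sent in an HTTP POST. The sketch below builds such a request body by hand. The class name XmlRpcRequest is hypothetical, and the method name "lookupUPC" used in the usage note is an assumption about the database's API rather than something confirmed here; check the service's documentation for the real method name and parameters.

```java
// Hypothetical helper that builds an XML-RPC request body with string parameters.
public class XmlRpcRequest {

    /** Escapes the characters that are significant inside XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Builds an XML-RPC methodCall document for the given method and string params. */
    public static String build(String methodName, String... params) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>");
        xml.append("<methodCall><methodName>")
           .append(escape(methodName))
           .append("</methodName><params>");
        for (String p : params) {
            xml.append("<param><value><string>")
               .append(escape(p))
               .append("</string></value></param>");
        }
        xml.append("</params></methodCall>");
        return xml.toString();
    }
}
```

For example, XmlRpcRequest.build("lookupUPC", "036000291452") produces a request carrying one string parameter, where the UPC shown is only a sample value. The body would be POSTed with a Content-Type of text/xml, and the reply parsed out of the returned methodResponse document.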
Converting Product Information Text to Speech
Once product information has been accessed via the unique UPC, the next task is to convert the data to something audible. To do this, the text has to be processed by a Text To Speech (TTS) engine.
Existing TTS technologies vary significantly in quality and pricing. Some proprietary technologies, like AT&T’s Natural Voice package, offer extremely realistic sound, but are often financially inaccessible to the open source community. Luckily, there are several open source projects that offer solid TTS synthesis free of charge.
For the Barcode Reader project, we opted for the Java-based FreeTTS.
The FreeTTS synthesizer offers three voices natively, with support for other open source TTS voices, like FestVox and CMU_ARCTIC. The library also offers developers the ability to create, play, and store audio files generated from text.
The TTS module of our code accepts product data, as text, from the UPC database. The text is then processed and synthesized with a 16kHz male voice by the TTS engine. FreeTTS’s SingleFileAudioPlayer allows us to save the synthesized data as a .wav file, which can then be passed back to the client to play.
As an optimization concern, we also designed the ability to compress this .wav file to .mp3 format. In test compressions, this resulted in a 90% file size reduction – a great feature when attempting to stream a large amount of product data over the network. However, J2ME does not currently support .mp3 playback, so our final demo employs the original .wav format.
Our MP3 compression is powered by Lame, an MP3 library for Java.
Send Audio File Back to Device and Play it
Sending the audio file back to the user turned out to be one of our team’s biggest challenges. Using JSP on the web server, we needed a way of sending audio data back to the device that passed us the original UPC image. After looking into MIME types and having little success, we decided to convert the .wav file into characters using Base64 encoding. This helped alleviate corruption during data transfer, and allowed us to download the entire audio file back to the user’s device.
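The encoding round trip can be sketched as below. This is a minimal illustration using java.util.Base64 from Java SE 8 and later; that class is not available on J2ME, so the phone side would need its own decoder. The class name AudioTransfer and its method names are hypothetical.

```java
import java.util.Base64;

// Hypothetical sketch of the text-safe audio transfer; not the project's code.
public class AudioTransfer {

    /** Server side: turns raw .wav bytes into ASCII-safe text. */
    public static String encode(byte[] wavBytes) {
        return Base64.getEncoder().encodeToString(wavBytes);
    }

    /** Client side: recovers the original bytes from the received text. */
    public static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }
}
```

Because the encoded form contains only printable characters, it survives a text-oriented channel such as a JSP response without byte corruption. The trade-off is size: Base64 emits four characters for every three input bytes, so the payload grows by roughly a third.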
Once on the phone, the audio is played using Java’s native support for .wav audio. The user is presented with the product’s name and description, completing the normative application cycle.
- 1) Currently, we are using only one UPC database. It would be better if we could reference multiple databases in the future, so that more complete product information could be retrieved.
- 2) The database we use now contains a brief description and the size of the product, but does not include nutrition facts or pricing information. Searches for free services of this type would be valuable to the growth of this project.
- 1) We would like to implement more realistic voices for data playback. Voices can be designed with open source software like FestVox, or can be purchased commercially for improved user experience.
- 2) We would like to complete the implementation of compressed .mp3 audio files for better performance. This objective is contingent on finding a development platform that handles this audio format.
ZXing (“Zebra Crossing”):
BaToo – Barcode Recognition Tookit: http://people.inf.ethz.ch/adelmanr/batoo/
Java Image Processing Library:
Image Processing – Grey Scale in Java:
UPC Database (Our Proof of Concept Database):
Additional UPC Databases:
See source code.