The Dutch newspaper De Leeuwarder Courant celebrates 261 years of continuous publication. No other newspaper has a longer history of publication under the same masthead. Every day, 110,000 enthusiastic readers get their news from one of the Netherlands' most respected newspapers, either in print or via the online edition.
X-CAGO, a technology company headquartered in Roermond, the Netherlands, is the service provider to the publisher of De Leeuwarder Courant. It makes sure the latest edition is available on the internet, and it is digitizing and making available online almost 1 million pages of news spanning more than 260 years of publication! The first issue of the newspaper was printed in July 1752.
Today’s news is tomorrow’s history. The cultural heritage contained in the archives of De Leeuwarder Courant is considered to be one of the most important sources of historical reference in the Netherlands.
One can read about the French Revolution, Napoleon, the Titanic, Hitler, Stalin, and the soccer World Cup, as well as obituaries and local news.
Digitizing the Archive
The initiator of this project is the Digital Archive Leeuwarder Courant Foundation, a joint effort of De Leeuwarder Courant (www.leeuwardercourant.nl) and Tresoar (www.tresoar.nl), the Frisian Historical and Literary Center. Opening the vault of the Leeuwarder Courant has been on the agenda of both partners for a long time. The advanced technology of X-CAGO (www.x-cago.com) is the key to making available the rich content of this exceptional newspaper archive.
Unlike many other vendors, X-CAGO scans directly from bound books instead of digitizing from microfilm. Microfilm has numerous disadvantages: lack of color, lack of authenticity, and inferior Optical Character Recognition (OCR) results. The Foundation opted for ultimate quality and therefore chose X-CAGO as its preferred technology partner, considering X-CAGO’s excellent track record. X-CAGO recently completed another top-notch project for a Belgian client (www.historischekranten.be).
Why digitize the Archives?
By digitizing almost one million newspaper pages, a rich source of Dutch and Frisian culture and heritage is revealed and made available to the general public via the internet. It is also preserved for future generations. Journalists, scientists, politicians, bankers, students and housewives now have access to this information, whether it’s today’s news or an article from 260 years ago.
Several things make this project unique:
- Continuity: 260 years of continuous journalism from a single, important, influential source, providing an abundance of Dutch and Frisian cultural heritage that will stimulate and impact scientific research in many disciplines;
- Volume: close to 1,000,000 pages;
- Diversity: the content spans multiple languages, including Dutch, Frisian and French, and reflects word usage that has shifted over the past 250 years;
- Technology: all constituent items (editorial and advertising) are accessible on an individual basis, resulting in close to 10,000,000 articles and ads;
- Search: the complete text is searchable despite variations in spelling or usage;
- Up-to-date: the archives are updated and include the most current editions;
- Quality newspaper: the Leeuwarder Courant has represented outstanding journalism for over a quarter of a millennium.
X-CAGO, headquartered in Roermond, the Netherlands, is executing this project, which includes scanning the newspaper pages, segmenting the pages into articles and ads, tagging the headlines, bylines, captions, and pictures, keeping track of the reading order, and publishing on the internet. The original pages are scanned directly from bound books with the most advanced video scanners available on the market today.
A mere glimpse at an old newspaper page is enough to make the differences in spelling obvious. Beyond spelling, entirely different words were once in use, often with different connotations. Unless both spelling and usage are taken into account, finding the information one is looking for quickly becomes very difficult. To address these issues, X-CAGO developed a web-based solution called “Archive Express”. As a result of this project, X-CAGO is adding Frisian to its list of supported languages. “Archive Express” can also eliminate typical OCR mistakes.
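To illustrate the general idea of searching across spelling variants, here is a minimal sketch in Python. It is not X-CAGO's implementation, which is not described here. The variant table, the function names and the sample sentence are all assumptions made for illustration: historical word forms are mapped to a modern form before query and text are compared.

```python
import re

# Hypothetical variant table: historical spelling -> modern spelling.
# These few entries are illustrative only; a real table would be far
# larger and would also need to cover Frisian and OCR confusions.
VARIANTS = {
    "mensch": "mens",
    "visch": "vis",
    "roode": "rode",
}

def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[^\W\d_]+", text.lower())

def normalize(token):
    """Map a known historical variant to its modern form."""
    return VARIANTS.get(token, token)

def matches(query, text):
    """True if every normalized query word occurs in the normalized text."""
    wanted = {normalize(t) for t in tokenize(query)}
    present = {normalize(t) for t in tokenize(text)}
    return wanted <= present

# A modern query finds text printed in the old spelling.
OLD_TEXT = "De mensch kocht een roode visch"
print(matches("rode vis", OLD_TEXT))
```

Because the same normalization is applied to both the query and the archived text, a reader can search with either the old or the new spelling.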
To gain a first impression of these capabilities, please visit: www.dekrantvantoen.nl
With this patented, state-of-the-art X-CAGO technology, exploiting newspaper archives on the internet becomes a reality. X-CAGO provides services to numerous publishers by placing their latest editions online. Every page is segmented into editorial content and ads. A user can run a full-text search through the complete archive. Because each page is segmented, only those articles and ads that truly match the user's search criteria are displayed. X-CAGO’s track record is impressive: it serves clients on a global scale, including New Zealand, Australia, South Africa, Cyprus, Iceland, the Netherlands, Belgium, Sweden, Norway, Denmark, the UK and the USA.
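The benefit of searching segments rather than whole pages can be sketched as follows. This is an illustrative Python example under stated assumptions, not the actual archive software: the `Segment` record, the identifiers and the sample texts are hypothetical. An inverted index maps each word to the segments containing it, so a query only returns an article or ad in which all query words occur together.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Segment:
    """One article or ad cut out of a page (hypothetical record layout)."""
    segment_id: str
    page: int
    kind: str  # "article" or "ad"
    text: str

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def build_index(segments):
    """Inverted index: word -> set of ids of segments containing it."""
    index = defaultdict(set)
    for seg in segments:
        for token in tokenize(seg.text):
            index[token].add(seg.segment_id)
    return index

def search(index, query):
    """Return ids of segments that contain every word of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits

# Three made-up segments that all sit on the same page.
SEGMENTS = [
    Segment("p1-a1", 1, "article", "Storm damages harbour in Harlingen"),
    Segment("p1-a2", 1, "article", "Council debates new bridge"),
    Segment("p1-ad1", 1, "ad", "Fresh fish sold daily at the harbour market"),
]
INDEX = build_index(SEGMENTS)

print(sorted(search(INDEX, "harbour")))         # article and ad that mention it
print(sorted(search(INDEX, "harbour bridge")))  # same page, but no single segment
```

With a page-level index, the query "harbour bridge" would return page 1, even though no single article is about both. With a segment-level index it returns nothing, which is the more accurate answer.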
Segmentation of Pages
X-CAGO’s ClipWorX software suite segments each page into its individual articles and ads. This unique technology is patented and, more importantly, very effective in deconstructing newspaper pages. The software creates an XML file for each component, extracting meta-information such as header, byline, caption, picture, text in reading order and more. The coordinates of each word are stored, which enables highlighting.
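The actual ClipWorX XML format is not shown here, so the following Python sketch uses a hypothetical schema (the element and attribute names `article`, `header`, `byline`, `word`, `x`, `y`, `w`, `h` are assumptions). It shows how stored word coordinates could be used to find the rectangles to highlight for a search term.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-article XML; the real schema may differ entirely.
SAMPLE = """
<article id="p1-a1">
  <header>Storm damages harbour</header>
  <byline>From our correspondent</byline>
  <text>
    <word x="120" y="340" w="64" h="18">Storm</word>
    <word x="190" y="340" w="92" h="18">damages</word>
    <word x="288" y="340" w="80" h="18">harbour</word>
  </text>
</article>
"""

def highlight_boxes(xml_string, term):
    """Return (x, y, width, height) for each word equal to the term."""
    root = ET.fromstring(xml_string.strip())
    boxes = []
    for word in root.iter("word"):
        if (word.text or "").strip().lower() == term.lower():
            boxes.append(tuple(int(word.get(k)) for k in ("x", "y", "w", "h")))
    return boxes

print(highlight_boxes(SAMPLE, "harbour"))
```

A viewer could draw these rectangles over the scanned page image, so the reader sees the hit in its original typography.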
Archive Ex Press
Archive Ex Press is the X-CAGO software for a digital archive. It is web-based, designed to serve multiple users, and able to hold millions of individual documents, both articles and ads. Special navigation functionality is supplied in order to:
- Turn pages;
- Select individual articles and ads within a page;
- Search throughout the complete archive.
Besides printed media, Archive Ex Press has excellent facilities for other kinds of collections: photographs, maps, engineering drawings, etcetera. X-CAGO offers services for digitizing your collections as well as placing them online.
Please contact X-CAGO for more information.
X-CAGO’s software solutions manage growing volumes of printed media content by letting clients securely capture analog documents (paper or microfilm) as well as digital documents (PDF, XML), organize and catalog this information, and deliver knowledge on demand. In short, the software enables you to capture, store and distribute newspaper and magazine content in an efficient and secure way. X-CAGO’s core software modules are backed by patents on their unique segmentation algorithms, which deconstruct the layout of newspaper and magazine pages into individual constituents, i.e. articles and advertisements.
Have a look at the YouTube video about this project (Dutch language).