arpa Digitization Concept


arpa Digitization concept for books and documents

Freely available digitized books are often only scanned or provided with rudimentary OCR. See link.

The arpa Digitization concept
  • The text recognized by OCR is indexed; its structure is segmented into words and broken down in depth. Information such as person, name, profession, location, or date range can be searched in correlation and in relation to one another.
  • The results are shown in the image of the original. The text is used for searching as well as for export to formats such as XML, RTF, and TXT. By clicking on a page, ancient writings and manuscripts can be read in the Latin alphabet and exported; for visually impaired readers, the text can be transferred to systems suited to them.
  • As much information as possible is made accessible with little effort.
  • The authoring system developed by ARPA lets you open up documents in depth; cross-links are created at the same time. It is well suited to scientific work, minutes (parliament), trustees, attorneys, and courts, as well as interactive education. A single search covers a multitude of documents.
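
The idea of indexing the recognized text and showing hits in the image of the original can be illustrated with a minimal sketch. It is not ARPA's implementation: the `OcrWord` record, its fields, the document names, and the pixel coordinates are all assumptions made up for illustration. Each recognized word keeps its position on the scanned page, so a search hit can be highlighted on the image, and one lookup covers several documents.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrWord:
    """One recognized word and where it sits on the scan (hypothetical layout)."""
    doc: str      # document identifier (invented for this sketch)
    page: int     # page number within the document
    text: str     # word as recognized by OCR
    box: tuple    # (x, y, width, height) in image pixels (invented values)


def build_index(words):
    """Map each lower-cased word to every occurrence across all documents."""
    index = defaultdict(list)
    for w in words:
        index[w.text.lower()].append(w)
    return index


def search(index, term):
    """Return (doc, page, box) per hit so a viewer could highlight it on the image."""
    return [(w.doc, w.page, w.box) for w in index.get(term.lower(), [])]


# Illustrative sample data, not taken from any real scan.
words = [
    OcrWord("letters-vol1", 12, "Abendmahl", (140, 310, 96, 18)),
    OcrWord("letters-vol1", 12, "Zürich", (250, 310, 60, 18)),
    OcrWord("letters-vol2", 3, "abendmahl", (88, 702, 94, 18)),
]
index = build_index(words)
hits = search(index, "Abendmahl")
```

Here `hits` contains one entry per occurrence, regardless of which document it came from. A real system would add normalization for historical spellings and the deeper structural breakdown described above; this sketch only covers exact, case-insensitive word matches.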

The example of the "Bullinger Briefe" gives you an overview of the 15 books that have been made accessible. You can ask to whom he wrote, or search for 'Abendmahl' (Last Supper) to see the letter writers who use the term. Many such combinations of search terms can be realized with the arpa digitization concept.

The example of "BAZ" (Basler Zeitung): in the demo version, search for the word 'union'; you can find people, places, and regions. This opens up search ideas that a conventional search would hardly turn up.

The example "ADB" (Allgemeine Deutsche Biographie): if you search for a name found in a document, further information can be found in the ADB. If you search for 'Bach', all persons of the Bach family are listed. Or you can look for a doctor who is connected to an Italian town. All searches can be restricted by time, place, profession, etc.
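
Restricting a search by time, place, and profession can be sketched as a simple filter over structured records. This is only an illustration under stated assumptions: the `Person` record, the `find` function, and its parameters are hypothetical, and the sample entries are illustrative (the physician entry is invented), not an extract from the ADB.

```python
from dataclasses import dataclass


@dataclass
class Person:
    """A biographical record with searchable attributes (hypothetical layout)."""
    name: str
    profession: str
    places: tuple   # towns the person is connected to
    born: int
    died: int


def find(people, name=None, profession=None, place=None, years=None):
    """Return names matching every given restriction; omitted ones are ignored."""
    results = []
    for p in people:
        if name and name.lower() not in p.name.lower():
            continue
        if profession and p.profession != profession:
            continue
        if place and place not in p.places:
            continue
        # years is a (start, end) range; keep people whose lifetime overlaps it.
        if years and not (p.born <= years[1] and p.died >= years[0]):
            continue
        results.append(p.name)
    return results


# Illustrative sample records; "Muster, Anna" is an invented placeholder.
people = [
    Person("Bach, Johann Sebastian", "composer", ("Eisenach", "Leipzig"), 1685, 1750),
    Person("Bach, Carl Philipp Emanuel", "composer", ("Weimar", "Hamburg"), 1714, 1788),
    Person("Muster, Anna", "physician", ("Padua",), 1600, 1660),
]

family = find(people, name="Bach")
doctors = find(people, profession="physician", place="Padua")
late = find(people, name="Bach", years=(1760, 1790))
```

The name match here is a plain substring test, so it would also return unrelated names that merely contain the search term; grouping a family, as described above, would need explicit family links in the data.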