The Center for Digital Research in the Humanities (CDRH) has created this brief guide to best practices for digital humanities projects.
The CDRH recommends that digital research projects be based on international standards. Standards-based projects stand a greater chance of interoperating with similar sites, and are more likely to migrate successfully into new computing environments as file formats and standards change. Examples of standard practices include: Extensible Markup Language (XML) for encoding content; the Text Encoding Initiative (TEI) as an application of XML for encoding textual content; non-proprietary data file formats; XQuery-based searches of XML content; and open-source database and web publishing frameworks.
A common early phase of digital research may involve building a Hypertext Markup Language (HTML) prototype to serve as an illustration or proof of concept. Although HTML is widely used for web content, it may not be adequate for recording and expressing important scholarly work. Its scope is limited because its fixed tag set describes how content should be presented, not what the content is. Simply put, HTML is useful for displaying documents, but is not adequate for detailed description of complex material.
XML is an internationally adopted encoding standard for describing data. It uses a tag structure to identify specific information within a document. Unlike HTML, XML is not limited to a fixed set of tags, because a single tag set would not suit all documents or applications. XML encodes text using elements and attributes. XML also introduced the concept of well-formedness: every element must be properly closed and nested. Software can check a document for well-formedness and, when a schema or DTD is supplied, validate the document against it to identify errors. XML's strict structure and open-ended extensibility make it suitable for a wide variety of humanities projects.
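As a minimal sketch of what a well-formedness check looks like in practice, the following Python example uses the standard library's `xml.etree.ElementTree` parser. The sample letter, its tag names (`letter`, `sender`, `date`), and the `when` attribute are invented for illustration and do not come from any particular project or schema. Note that this checks well-formedness only; validating against a schema or DTD would need additional tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: elements (<letter>, <sender>, <date>) plus one
# attribute (when="1873-04-02"). All names are invented for this sketch.
WELL_FORMED = """<letter>
  <sender>A. Writer</sender>
  <date when="1873-04-02">April 2, 1873</date>
</letter>"""

# Same content, but the <sender> element is never closed.
NOT_WELL_FORMED = """<letter>
  <sender>A. Writer
  <date when="1873-04-02">April 2, 1873</date>
</letter>"""


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def date_attribute(xml_text: str) -> str:
    """Read the machine-readable date stored in the 'when' attribute."""
    root = ET.fromstring(xml_text)
    return root.find("date").get("when")


print(is_well_formed(WELL_FORMED))
print(is_well_formed(NOT_WELL_FORMED))
print(date_attribute(WELL_FORMED))
```

The attribute carries a normalized date that software can sort and search, while the element text preserves the date as it appears in the source.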
Another advantage of XML is that it facilitates the separation of content from design. This separation can serve as a powerful component of managing and manipulating your project. Using the Extensible Stylesheet Language for Transformations (XSLT), design changes can be applied globally without re-encoding your XML data files.
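The idea of separating content from design can be illustrated without XSLT itself. The sketch below substitutes a small Python function for the stylesheet: the XML stays untouched while the presentation is chosen at render time. The sample document, its tag names, and the `render` function are all hypothetical; a real project would more likely express these rules in an XSLT stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical encoded content; the element names are invented for this sketch.
POEM_XML = """<poem>
  <title>Sample Poem</title>
  <line>First line of the poem</line>
  <line>Second line of the poem</line>
</poem>"""


def render(xml_text: str, line_tag: str) -> str:
    """Render the poem as HTML, wrapping each line in the given tag.

    The design decision (which HTML tag to use) lives here, not in the XML.
    """
    root = ET.fromstring(xml_text)
    parts = [f"<h1>{escape(root.findtext('title'))}</h1>"]
    for line in root.findall("line"):
        parts.append(f"<{line_tag}>{escape(line.text)}</{line_tag}>")
    return "\n".join(parts)


# Two different designs produced from the same, unchanged XML source.
as_paragraphs = render(POEM_XML, "p")
as_list_items = render(POEM_XML, "li")
print(as_paragraphs)
print(as_list_items)
```

Changing the design means changing only the rendering rule; none of the encoded files need to be touched.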
If XML is beyond the reach of a small project, the Extensible Hypertext Markup Language (XHTML) may be an appropriate interim solution. Existing HTML files can be converted to XHTML's stricter syntax. Whatever the final decision as to document type, use a doctype declaration at the top of the source.
TIFF and JPEG image file formats are advantageous in that they are not proprietary and are widely supported in many applications. Most image editing software programs have their own proprietary file formats, but they should not be used for long-term storage. Uncompressed TIFFs are excellent for storing original digital images, and JPEGs are good for displaying them in a browser.
Metadata for image files, including date of scan, type of scanning equipment used, filenames, etc., should be recorded and maintained.
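One simple way to keep such metadata is a small structured record stored alongside each image. The following Python sketch is only an illustration: the `ScanRecord` class, its field names, and the sample values are assumptions, and the `resolution_dpi` field is an example of an additional detail a project might choose to track, not a requirement stated in this guide.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ScanRecord:
    """Hypothetical metadata record for one scanned image."""
    filename: str
    scan_date: str       # ISO 8601 date, e.g. "2007-03-15"
    scanner: str         # type of scanning equipment used
    resolution_dpi: int  # assumed extra field, not named in the guide


def to_json(record: ScanRecord) -> str:
    """Serialize the record as JSON text suitable for a sidecar file."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


# Sample values are invented for this sketch.
record = ScanRecord(
    filename="page001.tif",
    scan_date="2007-03-15",
    scanner="Hypothetical flatbed scanner",
    resolution_dpi=600,
)
print(to_json(record))
```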
It is essential to have a plan for backing up all files, with redundant backup where possible. A good approach is to have at least two backup copies stored on different media at different locations.
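A backup is only useful if the copies are intact, so it can help to verify each copy when it is made. The Python sketch below copies a file to several destinations and compares SHA-256 checksums. The function names and the idea of passing destination directories as a list are assumptions made for this example; in practice the destinations would be different media at different locations, and a project may prefer dedicated backup software.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination directory and verify every copy.

    Raises OSError if a copy's checksum differs from the original's.
    """
    expected = sha256_of(source)
    copies = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(source, directory))
        if sha256_of(copy) != expected:
            raise OSError(f"Backup copy {copy} does not match {source}")
        copies.append(copy)
    return copies
```

A call such as `back_up(Path("page001.tif"), [Path("/mnt/disk_a"), Path("/mnt/disk_b")])` would produce two verified copies; the paths shown here are placeholders.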
Web Design, Navigation, and Testing
Carefully consider your use of colors and fonts in page design. Colors and fonts should be consistent, meaningful, and easy to read. Navigational elements should be intuitive to ensure usability. Pages should be consistently organized and well named. Meaningful URIs can help with placement in search engines. And, because bad links occur even within the best projects, you may want to consider creating a custom “page not found” page that directs users to a site map.
Sites should be tested to determine whether they work well on Windows, Mac OS, and Linux-based systems. Try various browsers, screen sizes, and color depths. All sites should also be tested to ensure they comply with accessibility standards for people with disabilities and users of assistive technology. Consider undertaking end-user usability studies.
Dynamic Content and Multimedia
Think about opportunities to incorporate multimedia in ways that take advantage of your digital work. Database searching, dynamic user-directed displays, audio and video represent scholarly opportunities that go beyond traditional print scholarship.
Google's success and ubiquity have helped create user demand for keyword and other searches. If you do not already have a search on your project site, you should consider adding one. While phrase and Boolean searches may be helpful, the keyword search is a good place to start.
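To show how little is needed for a basic keyword search, here is a minimal Python sketch built on an inverted index (a mapping from each word to the documents that contain it). The sample documents, their identifiers, and the function names are invented for illustration. Treating a multi-word query as "all words must appear" is a design choice made for this sketch, not a recommendation from this guide.

```python
import re
from collections import defaultdict

# Hypothetical project texts keyed by document id.
DOCUMENTS = {
    "letter-001": "The prairie wind carried dust across the homestead.",
    "letter-002": "Dust settled on the railroad depot by evening.",
    "diary-001": "We reached the homestead after the long railroad journey.",
}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def keyword_search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


INDEX = build_index(DOCUMENTS)
print(keyword_search(INDEX, "dust"))
print(keyword_search(INDEX, "homestead railroad"))
```

Phrase and Boolean searches could later be layered on top of the same index.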
Overviews of Basic Principles
Some of these web sites are geared toward creating digital libraries, but the practices recommended are sound for digital humanities projects as well:
- Framework of Guidance for Building Good Digital Collections, 2nd ed. (Bethesda, MD: NISO Press, 2004)
- Research Libraries Group's Guidelines and Tools
Markup and Encoding
- Extensible Markup Language (XML) 1.0 (Fourth Edition)
Other Metadata Sites
- Encoded Archival Description (EAD)
- Metadata Encoding and Transmission Standard (METS)
- Metadata Object Description Schema (MODS)
- Guidelines for Electronic Text Encoding and Interchange (TEI P5 - the XML Version of TEI)
Display and Searching
- W3Schools Online Web Tutorials: Browser Statistics
- W3C XML Query (XQuery)
- XSL Transformations (XSLT) Version 2.0