Assignment
The University of California, Los Angeles (UCLA) required a robust platform that combined advanced document management with Optical Character Recognition (OCR) processing and multi-threaded annotations. Our goal was to architect a custom solution that performs at scale, handling documents quickly and accurately while supporting multi-threaded annotation. The initiative aimed to strengthen UCLA's document management capabilities and to open up new collaborative and analytical opportunities across the institution.
Approach
We set out to create a user-friendly, high-performance platform that could handle potentially hundreds or thousands of students annotating the same document simultaneously while keeping the interface orderly and uncluttered. For the type selection and color scheme, we focused on readability and accessibility across resolutions and point sizes. The design had to support both reading and writing, as teachers and students would spend a significant amount of time writing within the app.
Outcome
The interface was built using a split-screen layout, with the source text on the left side of the screen and the annotations on the right. Filters and paragraph selection features were integrated to allow high levels of participation without excess visual clutter. Inline text transformation, auto-saving, and offline functionality were implemented to enhance the writing experience and improve user engagement.
Annotation activities were contained within assignments. Faculty members could attach any number of tasks to each document in an assignment, while a dashboard interface with due dates and color labeling allowed students to visually track their progress.
To handle hundreds of students and faculty members annotating and uploading documents concurrently, we employed a microservices architecture, which let us choose the best technology for each service and distribute the load across the app. Key technologies included Tesseract OCR for interpreting scanned text, Elastic for search, Angular for the frontend, and an open-source electronic document management system (EDMS).
A significant challenge was developing a simple way to upload, OCR-scan, sort by MIME type, and transform any scanned or digital document into semantic HTML ready for annotation.
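As a rough illustration of that kind of pipeline, the Python sketch below routes an uploaded file by its guessed MIME type and wraps the extracted text in semantic HTML with one addressable paragraph per block. It is not the production implementation: the function names, the route labels, the `p-N` paragraph ids, and the pluggable `ocr` callable are all assumptions made for this example.

```python
import html
import mimetypes
import re
from typing import Callable, Optional


def route_by_mime(filename: str) -> str:
    """Pick a processing route from the file's guessed MIME type (hypothetical labels)."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        return "unsupported"
    if mime.startswith("image/"):
        return "ocr"   # scanned pages need text recognition first
    if mime == "text/plain":
        return "text"  # already digital text, no OCR needed
    return "unsupported"


def text_to_semantic_html(text: str) -> str:
    """Wrap blank-line-separated paragraphs in <p> tags inside an <article>."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = "\n".join(
        # Collapse internal whitespace, escape markup, and give each
        # paragraph an id so annotations could target it (assumed scheme).
        f'  <p id="p-{i}">{html.escape(" ".join(p.split()))}</p>'
        for i, p in enumerate(paragraphs, start=1)
    )
    return f"<article>\n{body}\n</article>"


def convert(filename: str, data: bytes, ocr: Callable[[bytes], str]) -> Optional[str]:
    """Convert an upload to semantic HTML, or return None if the type is unsupported."""
    route = route_by_mime(filename)
    if route == "ocr":
        return text_to_semantic_html(ocr(data))
    if route == "text":
        return text_to_semantic_html(data.decode("utf-8"))
    return None
```

Here the `ocr` argument stands in for whatever OCR service is used. With Tesseract it might be a thin wrapper around the pytesseract package's `image_to_string`, but that wiring is left out so the sketch runs with the standard library alone.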
The platform was built with Python, Angular, HTML5, CSS, Terraform, Tesseract OCR, and other tools chosen to create a seamless user experience.
The custom platform we designed for UCLA successfully integrated advanced document management with OCR processing and multi-threaded annotations, enabling faculty and students to collaborate efficiently and at scale. The platform's user-friendly design and high-performance capabilities have made it an invaluable tool for the UCLA community, fostering enhanced engagement and learning experiences.
Technologies
We chose the technologies on this project to ensure reliability and performance at every step, accounting for future platform stability, developer support, uptime and reliability ratings, and a host of other factors.
Kind words
"Working with a bright, eager team, their passion for their work, and their passion for our work, was truly remarkable. Issues we thought that were intractable were solved with elegant and innovative design ideas. Integration with legacy systems proceeded much more smoothly than anyone anticipated. At every turn, their work was professional and exceptional."
Matthew Fisher
Associate Professor
UCLA