About Sufia (for Managers)

Questions and answers

How does Sufia relate to Hydra?
Hydra provides a framework for building web applications on top of a repository back-end and a search index. Sufia uses the full power of Hydra and extends it to provide common repository features such as file and folder upload, metadata assignment, flexible user and group access controls, derivative generation, faceted and full-text search, delivery/download, and a host of social experience-related features. It offers self-deposit and proxy deposit workflows and will be extended in 2016 to provide one or more mediated deposit workflows. Sufia delivers its rich and growing set of features via a modern, responsive user interface, which its users have come to expect from web applications.
How does Sufia relate to Fedora 4?
All versions of Sufia require the Fedora repository software. Sufia versions 6 and 7 work atop Fedora 4, though there are active instances of Sufia (versions 3-5) in production using Fedora 3. Sufia provides a user-friendly web application for depositing and retrieving items from a Fedora repository. Because Sufia is built on Hydra, which has a multi-tenant architecture, more than one Sufia-based or other Hydra-based application can use the same Fedora instance.
How does Sufia relate to Avalon / Curate / Worthwhile / CurationConcerns / Hydra-in-a-Box?
  • Avalon: The Avalon Media System is related to Sufia only in that they both build atop Hydra and are the most widely used solution bundles.
  • Curate: The Curate gem (based on Fedora 3) uses part of Sufia, namely its models and its background jobs. The last release of Curate was made in March 2014 and it depends on a version of Sufia released in November 2013. Curate was forked as the Worthwhile gem.
  • Worthwhile: The Worthwhile gem, as a continuation of Curate, uses the same parts of Sufia as Curate did and adds in Fedora 4 support. The last release of Worthwhile was in October 2014 and it depends on a version of Sufia released in January 2015. Worthwhile was forked and continues to be developed as CurationConcerns.
  • CurationConcerns: The CurationConcerns gem builds on Worthwhile and adds in support for Portland Common Data Model (PCDM) based repository objects. CurationConcerns is actively developed and is depended upon by Sufia 7.
  • Hydra-in-a-Box: Hydra-in-a-Box is being built using Sufia 7, and is in active development, taking what Sufia provides and adding in the configurability and robustness needed to make it a turnkey product.
Does Sufia use the Portland Common Data Model (PCDM)?
Sufia 7, the first beta of which was released in June 2016, supports the Portland Common Data Model.
What do I do when I want to customize (or extend) Sufia?
First, check the Sufia GitHub page to see if the aspect of the system you would like to customize has documentation, such as this section on customizing metadata (which is perhaps the most common customization). A good place to go next is the archives of the Hydra-Tech community discussion list. This will give you a sense of whether anyone else has already done work customizing the same thing, which could suggest existing documentation or one or more persons to consult with. Please do feel free to get in touch with the Hydra-Tech list, or Sufia’s product owner (Mike Giarlo), about your customization and how that might fit into the Sufia development roadmap as collaborative work.
I don’t have developers, but are there other ways I can contribute to Sufia development?
Absolutely. One way is by contributing use cases and user stories to inform future Sufia development. If you’re likely to be managing a Sufia-based repository at some point, then there are working groups and interest groups in the Hydra community that you may wish to join, in order to get a sense of the activities taking place in the community; to see which fellow Hydra institutions are involved and what they’re focusing on; and to learn who your service manager peers are.
Who provides technical support for Sufia?
Because Sufia and Hydra are open and community supported, there is no centralized or official technical support entity. One of the best ways to get technical support for Sufia is to join the Hydra-Tech list and post questions to the forum. The Hydra community also holds a weekly technical call where all are free to pose questions. Last, there are currently a handful of independent consultancies that specialize in assisting users of Hydra products. At this time, the best way to find these consultancies is to contact members of the Hydra community through one of the Hydra message forums.
What kinds of metadata are supported?
By default, Sufia provides relatively simple, Dublin Core-based RDF metadata. The Sufia GitHub wiki provides some technical documentation for customization per individual needs. Please note that customization of metadata, while common, will require in-house maintenance over time as new versions of Sufia are released.
How often are there new releases?
New Sufia releases come out approximately every 6-8 weeks, with one major version upgrade every year on average.
How can I keep my Sufia application updated? How often should I plan to upgrade it?
We have found that devoting one day per month (again, on average) to updating Sufia and other components, and then testing that work, is an effective way to stay up to date.
What components/gems are there in a complete “stack?”
Hydra itself requires a few very important pieces of software: Apache Solr, Blacklight, the Fedora Commons repository, and a SQL database (most often MySQL or PostgreSQL). In addition, Sufia requires the Redis datastore, ImageMagick, and FITS software. In terms of Ruby gems that are dependencies of Sufia, there are many, including hydra-head, active-fedora, hydra-collections, hydra-derivatives, hydra-editor, resque, resque-pool, google-api-client, and browse-everything.
How do I stay updated on the latest Sufia developments?
There are many ways to track Sufia development depending on what you’re interested in. To track development in-depth, you can follow along on GitHub. All Sufia development, technical documentation, and issue tracking are done on GitHub. To track it at a higher level (new releases, opportunities for collaboration, usage surveys), subscribe to the hydra-tech and hydra-community lists and keep an eye out for updates specifically about Sufia. Because Sufia is one of the most active and widely contributed-to Hydra projects, Sufia-related development is also occasionally discussed on the weekly Hydra Tech call.
What preservation activities does Sufia do automatically?
Every file that is uploaded via Sufia is “characterized” by the File Information Tool Set (FITS) tool. FITS extracts salient technical characteristics of each file, and that information is stored in Fedora as technical metadata and is searchable. Gathering this metadata can help drive future digital preservation strategies such as emulation, normalization, and format migration. Sufia uses Fedora 4’s fixity service to record and periodically evaluate file integrity for every file uploaded via Sufia. Sufia works with Fedora 4 to create and maintain a log of significant fixity events over time, which can aid in auditing changes to an object.
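To make the idea of a fixity check concrete, here is a minimal sketch in Python. It is not Sufia's or Fedora's implementation: the function names are hypothetical, and the choice of SHA-256 as the digest is an assumption made for illustration. It only shows the general principle of recording a checksum at deposit time and recomputing it later to detect change.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return a hex digest of the file's bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def fixity_check(data: bytes, recorded_digest: str) -> bool:
    """Recompute the digest and compare it with the one recorded at deposit."""
    return checksum(data) == recorded_digest


# At deposit time, a digest is recorded alongside the file.
original = b"example file contents"
recorded = checksum(original)

# A later check passes if the bytes are unchanged...
unchanged_ok = fixity_check(original, recorded)

# ...and fails if even one character has been altered.
altered_ok = fixity_check(b"example file c0ntents", recorded)

print(unchanged_ok, altered_ok)
```

A log of such pass/fail results over time is what makes it possible to audit when a file's content last matched its recorded state.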
Does Sufia provide use and user stats?
Yes, and this is an area that will likely have further development in 2016. For now it provides some basic stats around use and users, such as the following:
  • Number of objects deposited
  • Number of users
  • Number of downloads for individual files
  • Number of pageviews for individual files
  • Visualizations of downloads and page views for each file; on open access files, these visualizations are viewable by anyone via the “Analytics” link
Sufia also makes certain use/user stats available only to admin users:
  • Various statistics (number of users, number of files) within date ranges (e.g., number of new users between September 2014 and September 2015)
  • Number of total files in the system (with a breakdown by access controls)
  • Number of total users, who they are, and the number of files they’ve deposited
  • Top five users (users with the most files deposited)
  • Top file formats
How many people are developing Sufia? And is the Sufia development community “right” for my developers?
There are approximately 30 institutions actively developing Sufia, some more actively than others. On top of these, the larger Hydra development community contributes functionality to the core Hydra framework which Sufia extends. Sufia has so far had code contributions from 38 different developers (as of Nov. 2015). As to whether the community is “right” - from a skill-set perspective it will be helpful for your developers/administrators to have experience (or be willing to gain experience) with Ruby on Rails, Solr, Blacklight, and Fedora Commons. The developer community for Hydra is very supportive of new users and there are active communication channels for developers seeking assistance and conversation around Hydra and Sufia issues. The Hydra community is committed to providing a welcoming and inclusive experience for all involved parties.
What user roles (depositor, curator, administrator) exist off the shelf?
  • Unauthenticated user (public)
  • Authenticated user/Depositor
  • Administrator (for access to certain use/user stats and for editor access to certain pages where content is dynamic)
  • A number of Sufia adopters have used the hydra-role-management gem for adding extra roles to Sufia.
When can I expect X to be implemented?
That depends on a number of factors, including what functionality you are interested in, who may already be working on it (or have done related work in codebases other than Sufia), and who is interested in doing this work when. The Hydra community works together to advance collective interests, and this collaborative model has been leveraged considerably in Sufia. The open community model means that those who want new functionality need to communicate their needs to the Hydra community, to align timelines and user stories, and, where possible, to allot time and resources to collaborate. Get in touch early and often, either with the hydra-tech or hydra-community lists or with Sufia’s product owner (Mike Giarlo). We have made a conscious decision not to maintain and commit to a long-term roadmap for Sufia (or even for Hydra), as this provides agility and the ability to take advantage of opportunities to collaborate that sometimes cannot be known in advance.
What other products/services does Sufia integrate with out-of-the-box?
  • Uses the browse_everything gem to allow users to upload files via cloud providers such as Dropbox, Box, Google Drive, and SkyDrive
  • Exports metadata to Zotero, Mendeley, and EndNote
  • Integrates with Zotero’s new “My Publications” feature for automated upload and management of Zotero-managed content into Sufia
  • Leverages Google Analytics for usage statistics
  • Sufia does not yet provide ORCID integration, but there is an orcid gem that is intended to be implemented atop Sufia in your application and there are Sufia installations that have already done this and may be consulted.
  • Sufia does not yet provide DOI integration, but there is a hydra-remote-identifier gem that is intended to be implemented atop Sufia in your application and there are Sufia installations that have already done this and may be consulted.
  • Sufia does not provide institutional Single Sign-On integration since there are many possible SSO products in use. There are a number of Sufia installations which do integrate with SSO and it’s often possible to adopt other institutions’ recipes to do this customization.
Does it work with DPN / APTrust / DuraCloud / DPLA / MetaArchive, etc.?
There are many such services in this space, and Sufia does not provide integration with any out of the box. There are some Sufia installations which do integrate with some of the above services, which you may consult for hints and tips.
We're ready to test/move forward - what do we need to do to get started?
Great to hear it! There are numerous ways to plug in. First, be sure to join the hydra-tech and hydra-community lists to stay aware of the latest developments and to share your progress. Attend the annual Hydra Connect event, which is the one event that all Hydra community members endeavor to attend to connect with fellow Hydra adopters. If your developers and devops have not done any Hydra-specific training, have them go through the Dive into Hydra tutorial and keep an eye out for upcoming Hydracamp training sessions. Hydracamp is a cornerstone of the Hydra ramping-up process. Much of the work done in Hydra, and now in Sufia, is done in Interest Groups and Working Groups, so review the list of IGs/WGs and consider joining an existing group or starting your own. And last but not least, do feel free to drop a line to Sufia’s product owner (Mike Giarlo), who can help orient you and get you whatever information or context would be helpful in planning and decision making.
How does Sufia handle many/large files?
Sufia comes preconfigured to limit the number of files that can be uploaded at one time to 100, and each file is allowed to be up to 500MB. There is an open issue to revisit these numbers and allow them to be more easily configurable -- if this is a need of yours, we encourage you to get in touch especially if you have development cycles to work on this. Beyond these limits, we suggest using Sufia's integration with the browse-everything gem to upload large files asynchronously via a third-party storage provider such as Dropbox or Google Drive. Another option that browse-everything provides, though this requires some small amount of work to set up, is to allow server-based uploads, where you open up space on your server for users to drop files into, and configure Sufia to point at your server space.
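As a rough illustration of what the default limits mean in practice, the following Python sketch checks a batch of files against them. It is not Sufia code: the constant and function names are hypothetical, and treating 500MB as 500 * 1024 * 1024 bytes is an assumption. The only values taken from the text above are the 100-file and 500MB defaults.

```python
# Default limits described above; the names are hypothetical.
MAX_FILES_PER_UPLOAD = 100
MAX_FILE_SIZE_BYTES = 500 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def validate_batch(file_sizes):
    """Return a list of human-readable problems; an empty list means the batch fits."""
    problems = []
    if len(file_sizes) > MAX_FILES_PER_UPLOAD:
        problems.append(
            f"{len(file_sizes)} files exceeds the limit of {MAX_FILES_PER_UPLOAD}"
        )
    for index, size in enumerate(file_sizes):
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"file {index} is larger than 500MB")
    return problems


# A small batch fits; a batch containing one oversized file does not.
small_batch = validate_batch([10 * 1024 * 1024] * 3)
oversized = validate_batch([MAX_FILE_SIZE_BYTES + 1])
print(small_batch, oversized)
```

A batch that fails a check like this is a candidate for the browse-everything or server-based upload routes described above.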
