Enterprise content management with open source tools
The Alfresco document manager supports a wide range of document management and storage standards, providing a reliable basis for storing and retrieving documents and their associated metadata. It is open-source software, making it easy to develop, customize and integrate individual components (modules).
The Alfresco document repository implements the services defined in the CMIS and JCR standards, which include:
I already have XYZ system. Can I use Alfresco with it?
Alfresco is widely used in electronic document management thanks to its strong integration capabilities. It can be integrated into a company’s existing infrastructure and installed on Microsoft Windows and Linux, including in virtualized environments, as shown in the figure below.
Any storage device can be used to store the content (local disk, network file system, etc.), while metadata can be stored in common RDBMSs (MS SQL, Oracle DB).
In the context of an eDMS project, it is essential that the system can communicate with existing systems. Alfresco’s APIs serve this purpose, and support for the CMIS standard enables it to exchange information about documents.
Supported APIs:
Possibilities for using the CMIS standard:
In addition to storing documents, Alfresco supports full-text search through indexing, so searches cover not only the metadata but also the content of documents. Indexing is performed by Apache Solr, another open-source module, which is tightly integrated with the Alfresco document repository.
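As a rough illustration of searching metadata and content together, the sketch below assembles a CMIS query language statement that combines a metadata predicate on `cmis:name` with the full-text `CONTAINS()` predicate. The helper names `cmis_literal` and `build_query` are hypothetical and not part of any Alfresco API. The sketch assumes the repository accepts CMIS queries, and it only builds the statement; sending it to a server is left out.

```python
from typing import Optional


def cmis_literal(value: str) -> str:
    """Quote a value as a CMIS QL string literal.

    CMIS QL escapes backslashes and single quotes with a backslash.
    """
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_query(name_prefix: Optional[str] = None,
                full_text: Optional[str] = None) -> str:
    """Build a CMIS query over documents (hypothetical helper).

    name_prefix filters on metadata (the cmis:name property);
    full_text filters on indexed document content via CONTAINS().
    Simplification: '%' or '_' inside name_prefix are not escaped,
    so they would act as LIKE wildcards.
    """
    clauses = []
    if name_prefix:
        clauses.append(f"cmis:name LIKE {cmis_literal(name_prefix + '%')}")
    if full_text:
        clauses.append(f"CONTAINS({cmis_literal(full_text)})")

    query = "SELECT cmis:objectId, cmis:name FROM cmis:document"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return query


if __name__ == "__main__":
    print(build_query(name_prefix="Invoice", full_text="overdue"))
```

Keeping the metadata predicate and the `CONTAINS()` predicate in one statement lets a single query narrow results by both property values and document text.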
To ensure high availability, Alfresco repositories can be clustered so that, if a node fails, business processes are not affected by the outage and can continue uninterrupted. The system can also be monitored via SNMP or JMX, so that the operations team is notified of a failure in time, or can anticipate one.
The internal architecture of Alfresco is divided into three layers.
The Content Repository layer is responsible for managing content and related metadata and properties. The layer controls the following:
The Storage layer is responsible for storing and managing the following and, in addition, for passing queries to the Apache Lucene / Solr search service:
This layer is not closely tied to the management of users and groups, or to administration and operation. Alfresco can authenticate users against its own local directory. In large enterprise environments this capability is used less often, so directory services already in place, such as Microsoft Active Directory and IBM Tivoli Directory, can also serve as the source of user identities via their LDAP interfaces.
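The idea of several identity sources (a local directory plus an enterprise directory reached over LDAP) can be sketched as a chain that is tried in order. This is a conceptual illustration only: `InMemoryDirectory` and `authenticate_via_chain` are hypothetical names, the directories are in-memory stand-ins rather than real LDAP connections, and plain-text passwords are used purely to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass
class InMemoryDirectory:
    """Stand-in for a user directory (hypothetical).

    A real LDAP-backed source would attempt a bind against the
    directory server instead of comparing stored values.
    """
    name: str
    users: Dict[str, str] = field(default_factory=dict)

    def authenticate(self, username: str, password: str) -> bool:
        return self.users.get(username) == password


def authenticate_via_chain(chain: Sequence[InMemoryDirectory],
                           username: str,
                           password: str) -> Optional[str]:
    """Try each source in order; return the name of the first that accepts."""
    for source in chain:
        if source.authenticate(username, password):
            return source.name
    return None


if __name__ == "__main__":
    local = InMemoryDirectory("local", {"admin": "s3cret"})
    corporate = InMemoryDirectory("active-directory", {"jdoe": "pw"})
    print(authenticate_via_chain([local, corporate], "jdoe", "pw"))
```

Ordering the chain with the local directory first keeps built-in accounts usable even when the enterprise directory is unreachable; whether that ordering is desirable depends on the deployment.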
The Infrastructure layer is responsible for, among other things, interfacing with the various RDBMSs, and for making it easier to port between them, via so-called SQL dialects. The supported RDBMSs are:
The layer is also responsible for creating, managing and updating the indexes used for searching.
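To make the notion of SQL dialects mentioned for this layer more concrete, the sketch below renders the same paging request differently per database. It is a generic illustration and not Alfresco's actual dialect mechanism. The dialect keys, the `paginate` helper, and the `document_node` table are hypothetical, and the PostgreSQL and MySQL entries are included only for contrast. The `OFFSET ... FETCH` form assumes SQL Server 2012 or later and Oracle 12c or later, and requires the query to carry an `ORDER BY`.

```python
# Paging syntax per dialect; keys are hypothetical labels, not Alfresco settings.
PAGINATION_TEMPLATES = {
    "postgresql": "{query} LIMIT {limit} OFFSET {offset}",
    "mysql": "{query} LIMIT {limit} OFFSET {offset}",
    "sqlserver": "{query} OFFSET {offset} ROWS FETCH NEXT {limit} ROWS ONLY",
    "oracle": "{query} OFFSET {offset} ROWS FETCH NEXT {limit} ROWS ONLY",
}


def paginate(dialect: str, query: str, limit: int, offset: int) -> str:
    """Append a dialect-specific paging clause to an ordered SELECT."""
    try:
        template = PAGINATION_TEMPLATES[dialect]
    except KeyError:
        raise ValueError(f"unsupported dialect: {dialect}") from None
    # Coerce to int so only numeric values reach the SQL text.
    return template.format(query=query, limit=int(limit), offset=int(offset))


if __name__ == "__main__":
    base = "SELECT id FROM document_node ORDER BY id"
    for name in ("postgresql", "sqlserver"):
        print(name, "->", paginate(name, base, limit=10, offset=20))
```

Isolating such differences behind one function is what allows the rest of the application to stay unchanged when the underlying database is swapped.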