Liferay integration with Amazon S3

Liferay’s Documents and Media library can mount several repositories at once while presenting a unified interface to the user. By default, Liferay uses its own file-system repository, i.e. document_library. Liferay also supports other repository types, e.g. CMIS Store, JCR Store, Documentation Store and Amazon Simple Storage Service (S3 bucket).

Amazon’s Simple Storage Service (S3) is a cloud-based storage solution that you can use with Liferay. All you need is an account, and the setup is straightforward. When you sign up for the service, Amazon assigns you unique keys (Access Key and Secret Key); these, together with a bucket name, are what Liferay needs for the integration. Liferay application server(s) can then store documents in the cloud seamlessly.

Recently, we migrated a huge document repository from a NAS used by Liferay 6.2 to Amazon’s S3 storage service. Here are some insights into the migration and integration.

This is the high-level design diagram for Liferay and the S3 bucket:

[Image: high-level design diagram for Liferay and the S3 bucket]

1. Fresh Setup:

Setting up a fresh Liferay instance with an S3 bucket is very simple. It only requires the following properties in portal-<environment>.properties or portal-ext.properties:

dl.store.s3.access.key=xxxxxxxx
dl.store.s3.secret.key=xxxxxxxx
dl.store.s3.bucket.name=xxxxxxxx
dl.store.impl=com.liferay.portlet.documentlibrary.store.S3Store

Before you start, make sure you have read and write access to the given S3 bucket. You can verify this with CloudBerry Explorer or a similar tool.
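If a GUI tool is not at hand, the same read/write check can be scripted. The following is a minimal sketch in Python, assuming a boto3-style S3 client is passed in; the function name check_bucket_access and the probe key are illustrative and not part of Liferay or AWS.

```python
import uuid


def check_bucket_access(client, bucket_name):
    """Write, read back, and delete a small probe object in the bucket.

    `client` is assumed to behave like a boto3 S3 client
    (put_object / get_object / delete_object). Returns True when the
    data read back matches what was written; any error raised by the
    client (for example an access-denied error) propagates to the caller.
    """
    # Hypothetical probe key; the random suffix avoids touching real documents.
    key = "liferay-access-probe-" + uuid.uuid4().hex
    payload = b"liferay s3 access probe"

    # Write access: upload the probe object.
    client.put_object(Bucket=bucket_name, Key=key, Body=payload)
    try:
        # Read access: download it again and compare.
        body = client.get_object(Bucket=bucket_name, Key=key)["Body"].read()
    finally:
        # Clean up even if the read fails, so no probe object is left behind.
        client.delete_object(Bucket=bucket_name, Key=key)
    return body == payload
```

With boto3 installed and credentials configured, this could be called as check_bucket_access(boto3.client("s3"), "<your bucket name>"). Passing the client in keeps the function easy to try out against a stub before touching a real bucket.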

2. Data Migration:

When infrastructure moves from a physical environment to the AWS cloud, there are many things to take care of. For the document repository, the “data” folder on a SAN, NAS or any shared/mounted drive can’t simply be copied and pasted into an S3 bucket. Here is the process:

Pre-requisites:

  • Data from the source system should first be copied locally to the relevant Liferay server.
  • The S3 bucket name, Access Key and Secret Key should be ready.

Now add the properties below to portal-<environment>.properties or portal-ext.properties:

Line #1: dl.store.impl=<the store you are currently using> (e.g. in this case: com.liferay.portlet.documentlibrary.store.AdvancedFileSystemStore)

Line #2: dl.store.file.system.root.dir=xx/xx/xx (only if you are not using the default document_library location)

Line #3: dl.store.s3.access.key=xxxxxxxx

Line #4: dl.store.s3.secret.key=xxxxxxxx

Line #5: dl.store.s3.bucket.name=xxxxxxxx

Line #6: # dl.store.impl=com.liferay.portlet.documentlibrary.store.S3Store
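Before restarting, it can help to sanity-check the file against the six lines above. The sketch below is in Python and uses a deliberately simple key=value parser (it ignores Java properties features such as line continuations); the helper names parse_properties and migration_problems are made up for this illustration.

```python
S3_STORE = "com.liferay.portlet.documentlibrary.store.S3Store"
REQUIRED_S3_KEYS = (
    "dl.store.s3.access.key",
    "dl.store.s3.secret.key",
    "dl.store.s3.bucket.name",
)


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Later definitions override earlier ones, as in a properties file.
        props[key.strip()] = value.strip()
    return props


def migration_problems(text):
    """Return reasons why the file is not ready for the migration step."""
    props = parse_properties(text)
    problems = []
    # Lines #3 to #5: the S3 credentials and bucket name must be present.
    for key in REQUIRED_S3_KEYS:
        if not props.get(key):
            problems.append("missing or empty: " + key)
    # Line #6 must still be commented out while migrating.
    if props.get("dl.store.impl") == S3_STORE:
        problems.append("dl.store.impl is already S3Store; comment it out until the migration is done")
    return problems
```

An empty list from migration_problems(open("portal-ext.properties").read()) suggests the file matches the layout described above; it does not check that the keys themselves are valid.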

Then restart the server.

After restarting the server, go to Control Panel > Server Administration > Data Migration. You will find this:

[Screenshot: Data Migration tab]

Note: line #6 (dl.store.impl=com.liferay.portlet.documentlibrary.store.S3Store) must stay commented out in portal-ext.properties before you open the Data Migration tab. If it is not commented out, the drop-down will not show S3 Store as a migration option.

Now select the S3 Store option, click ‘Execute’ and watch the server console, which should show something like this:

09:15:26,713 WARN [Liferay/document_library_pdf_processor-1] [RestStorageService: 221] Content-Length of data stream not set, will automatically determine data length in memory

Don’t worry about this warning; it does not affect the migration. The migration keeps running for a time that depends on the size of the data. Meanwhile, you can verify that files are arriving in the S3 bucket (the root folder is named after the company ID). Until the migration completes, the portal UI shows the page below:

[Screenshot: portal page shown while the migration is in progress]
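To watch progress from the S3 side, the objects under the company ID folder can be counted periodically. This is a rough sketch in Python, again assuming a boto3-style client is passed in; summarize_prefix is a hypothetical helper name, and the company ID used in the note after the code is only an example value.

```python
def summarize_prefix(client, bucket_name, prefix):
    """Count objects and total bytes under `prefix` in the bucket.

    `client` is assumed to behave like a boto3 S3 client; the
    list_objects_v2 paginator is used so that buckets with more than
    one page of results are counted completely.
    """
    count = 0
    total_bytes = 0
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket_name, Prefix=prefix):
        # "Contents" is missing from a page when nothing matches the prefix.
        for obj in page.get("Contents", []):
            count += 1
            total_bytes += obj["Size"]
    return count, total_bytes
```

For instance, summarize_prefix(boto3.client("s3"), "<your bucket name>", "10157/") would report on a company whose ID is 10157 (a made-up value here). Running it a few minutes apart should show the count growing while the migration is still in progress.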

After the migration completes successfully, you will get the following message:

Migration completed successfully in xxxxxxxx ms

The last step is to edit portal-ext.properties and uncomment line #6:

dl.store.impl=com.liferay.portlet.documentlibrary.store.S3Store

Then comment out line #1 and line #2, which define “dl.store.impl” and “dl.store.file.system.root.dir” for the old store.
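If several environments need the same change, the edit described above can be scripted. The sketch below is one possible approach in Python, working on the file's text; switch_to_s3 is a hypothetical helper, and it assumes plain key=value lines with # comments.

```python
S3_STORE_LINE = "dl.store.impl=com.liferay.portlet.documentlibrary.store.S3Store"
KEYS_TO_DISABLE = ("dl.store.impl", "dl.store.file.system.root.dir")


def switch_to_s3(text):
    """Return properties text with the old store disabled and S3Store enabled."""
    out = []
    s3_enabled = False
    for raw in text.splitlines():
        stripped = raw.strip()
        # Compare against the S3Store line ignoring a leading # and spaces.
        normalized = stripped.lstrip("#").strip().replace(" ", "")
        if normalized == S3_STORE_LINE:
            # Emit a single active S3Store line, commented or not before.
            if not s3_enabled:
                out.append(S3_STORE_LINE)
                s3_enabled = True
            continue
        key = stripped.partition("=")[0].strip()
        if not stripped.startswith("#") and key in KEYS_TO_DISABLE:
            # Line #1 and line #2: comment out the old store settings.
            out.append("# " + raw)
            continue
        out.append(raw)
    if not s3_enabled:
        # The file had no S3Store line at all, so add one at the end.
        out.append(S3_STORE_LINE)
    return "\n".join(out) + "\n"
```

The function leaves the S3 key and bucket lines untouched and is safe to run twice, since already-commented lines are passed through unchanged. Keeping a backup of the original file before overwriting it is still a sensible precaution.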

Now restart your server; it is integrated with the S3 bucket. Try adding new files and downloading both migrated and newly added files. Everything should work seamlessly.