Drupal to AEM CMS Migration Approach

Milestones for the Drupal to AEM Migration Approach:


The following steps illustrate the sequence of events in a content migration.
§   Finalize the functional specifications design
§   Finalize the current and future document state
§   Install and configure the AEM author and publish environments
§   Finalize the new design based on the existing and future website structure
§   Finalize the mapping structure between the existing website and the new website
§   Validate the migration functionality in the test environment, then move it to the production environments
§   Add new content entries to the newly designed website
§   Pre-migration of final content: finalize the content freeze
§   After content migration, validate in the production environments
§   Delta migration from the existing website to the new website, based on the new structure
§   Finalize the go-live date

    Existing Drupal system to AEM system mapping

























Fig.1 Drupal system to AEM system Mapping

Pre-Migration Activities: AS-IS Site to To-Be Site Migration

§  The diagram below describes the pre-migration steps.
§  Existing site audit analysis
§  Page to template mapping
§  Components to page mapping
§  Web function migration
§  Hidden challenges of content migration
§  Finalizing the wireframe
§  Finalizing base templates based on page structure
§  SEO impact analysis
§  Complete sitemap analysis
§  Identification of migration data






















Fig: 2 Pre-migration steps

Template mapping


§  Template designs for obsolete pages will reference the AS-IS website pages
§  Templates for the To-Be site are designed based on the To-Be page structure
§  Association of To-Be templates






























Fig: 3 Template mapping from AS-IS website to To-Be website

Page and Template mapping from AS-IS website to To-Be website pages


§  Identify page design components with reference to the AS-IS page structure
§  Create new pages based on the finalized To-Be base templates
§  Associate each AS-IS page with a To-Be page template
























Fig: 4 Page and Template mapping from AS-IS website to To-Be website

Components and Page Mapping


§  The AS-IS website component list will be mapped to the To-Be site component list
§  Validate the AS-IS component mapping against the AEM OOTB component list
§  Use OOTB or custom components when migrating static content and dynamic component functionality from the AS-IS website to the To-Be website
§  Create pages based on the mapped To-Be template, using custom and OOTB AEM components.
























Fig: 5 Components to Page Mapping from AS-IS website to To-Be website

AEM tree taxonomy:

§  Based on the existing taxonomy, create the taxonomy in the AEM CMS
§  Finalize the taxonomy structure

Sitemap

§  Derive the sitemap from the sitemap produced by the existing site audit analysis, plus the new pages that need to be added.
§  The To-Be sitemap will be created from the existing site audit map and the planned To-Be pages

Cleaning Your Data


Data cleaning includes the following tasks:
§  Inline markup
§  Embedded links
§  Embedded server-side code
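
The three cleanup categories above can be audited before migration. The following is a minimal sketch, assuming content bodies are available as HTML strings and that server-side code appears as PHP tags; the patterns and the `audit_body` helper are illustrative assumptions, not part of any Drupal or AEM API.

```python
import re

# Illustrative patterns for the three cleanup categories (assumptions).
INLINE_MARKUP = re.compile(r'\sstyle\s*=\s*"[^"]*"|</?font[^>]*>', re.IGNORECASE)
EMBEDDED_LINK = re.compile(r'href\s*=\s*"([^"]+)"', re.IGNORECASE)
SERVER_CODE = re.compile(r"<\?php.*?\?>", re.IGNORECASE | re.DOTALL)


def audit_body(html):
    """Report what needs cleaning in one content body."""
    return {
        "inline_markup": len(INLINE_MARKUP.findall(html)),
        "embedded_links": EMBEDDED_LINK.findall(html),
        "server_side_code": len(SERVER_CODE.findall(html)),
    }


# Hypothetical content body used only for demonstration.
sample = '<p style="color:red">See <a href="/node/12">this</a><?php echo $x; ?></p>'
report = audit_body(sample)
print(report)
```

Running the audit over every exported body gives a rough estimate of how much manual cleanup to expect; regular expressions are only a first pass and will miss unusual markup.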

Avoid SEO Impact during website migration


§  Key points to consider during website migration:
§  Industry best-practice SEO guidelines will be followed for the website migration: metadata, keywords, and tags are migrated from the AS-IS website to the To-Be site

§  During migration, minimize traffic loss
§  During migration, minimize ranking drops
§  During migration, maintain key rankings
§  During migration, maintain head traffic
§  During migration, cover additional keywords
§  During migration, eliminate non-performers
§  During migration, avoid loss of indexed pages
§  During migration, avoid 404 errors
§  During migration, avoid the old site still showing in Google
§  During migration, avoid loss of page rank
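
One way to guard against 404 errors and lost indexed pages is to check that every old URL has a permanent redirect target before cutover. The sketch below assumes the old URLs come from the site audit and that a hand-maintained mapping exists; the URLs, the AEM paths, and the `build_redirects` helper are hypothetical.

```python
def build_redirects(old_urls, url_map):
    """Split old URLs into permanent redirects and URLs that would 404."""
    redirects, unmapped = [], []
    for old in old_urls:
        new = url_map.get(old)
        if new:
            redirects.append((old, new, 301))  # 301 = permanent redirect
        else:
            unmapped.append(old)
    return redirects, unmapped


# Hypothetical audit output and mapping, for illustration only.
old_urls = ["/node/1", "/node/2", "/about-us"]
url_map = {
    "/node/1": "/content/site/en/home.html",
    "/about-us": "/content/site/en/about.html",
}
redirects, unmapped = build_redirects(old_urls, url_map)
print("redirects:", redirects)
print("would 404:", unmapped)
```

Any entry in the unmapped list needs either a redirect target or a deliberate decision to retire the page.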

Digital asset missing

§  During pre-migration, identify the assets to be migrated
§  During pre-migration, create the asset taxonomy structure in AEM and upload images to AEM

Web Function

§  During migration, identify the web functions, if any
§  Prepare a plan for the static and dynamic component migration


Migration Cutover-Activities


§  List of Migration Activities


















Fig: 6 Migration cut over activities
Migration Cutover Process:

§  Migration from UAT to Prod environment
§  After the content freeze in the existing system, any new pages created in the AS-IS website will use the To-Be template
§  During delta migration, authors in the Drupal system should note the content and the To-Be template used to create it in the existing website
§  Delta migration to SIT, then from SIT to UAT and from UAT to the Prod environment





























Fig: 7 Migration Cutover process

Database table mapping from the existing database to the AEM repository

§  Existing MySQL and MS SQL database tables will be mapped to the AEM JCR repository
§  Drupal content will be migrated from MySQL to the AEM JCR repository
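
To make the table-to-repository mapping concrete, here is a minimal sketch that turns one Drupal row into a nested dictionary shaped like an AEM page node. The row layout, the `drupalNid` property, and the choice of `foundation/components/text` as the resource type are assumptions for illustration; the real mapping comes from the agreed mapping specification.

```python
def node_to_jcr(row):
    """Map one Drupal node row (a dict) to a nested dict shaped like a JCR page."""
    return {
        "jcr:primaryType": "cq:Page",
        "jcr:content": {
            "jcr:primaryType": "cq:PageContent",
            "jcr:title": row["title"],
            "drupalNid": row["nid"],  # hypothetical property, kept for traceability
            "text": {
                # Assumed resource type; use whatever the mapping spec defines.
                "sling:resourceType": "foundation/components/text",
                "text": row["body"],
            },
        },
    }


# Hypothetical row, e.g. the result of joining the node and body tables.
row = {"nid": 42, "title": "About us", "body": "<p>Hello</p>"}
page = node_to_jcr(row)
print(page["jcr:content"]["jcr:title"])
```

Keeping the source identifier on the target node makes later validation and delta runs easier, because each migrated page can be traced back to its Drupal record.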

SEO-tags

§  During migration from Drupal to AEM, all the SEO tags will be migrated
§  During SEO tag migration, the association of tags with content and DAM assets will be handled in the AEM tagging system

Existing Tag structure and entire tag list

§  The existing tag structure will be preserved so that SEO is not impacted.
§  If required, the tag structure will be reorganized during migration

SEO Friendly URL Redirects

§  URL mapping to achieve search-engine-friendly URLs can be accomplished with the Apache Sling Resource Resolver, configured through the Apache Felix Web Management Console.

Videos

§  Upload videos and images from the local file system to AEM through the DAM console

Converting HTML embedded content to Plain text


§  Most of the content in the Drupal CMS is HTML-embedded content; during migration, this content is converted to plain text.
§  Because this approach does not embed raw HTML inside AEM components, all HTML-embedded content from the AS-IS website will be converted to AEM components.
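
The HTML-to-plain-text conversion described above can be sketched with Python's standard `html.parser` module. This is a minimal sketch under the assumption that only the visible text is needed; the `TextExtractor` class, the `html_to_text` helper, and the list of block-level tags are illustrative choices.

```python
from html.parser import HTMLParser

# Tags treated as text boundaries so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}


class TextExtractor(HTMLParser):
    """Collects visible text and drops all markup."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    # Collapse runs of whitespace into single spaces.
    return " ".join("".join(parser.parts).split())


print(html_to_text("<p>Hello <b>wor</b>ld</p><p>Bye &amp; thanks</p>"))
```

The extracted text can then be placed in a Text component, while images and links found in the original markup are handled by their own components.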

Static content and dynamic content


§  When migrating content from Drupal to AEM, map the content and content types to the AEM OOTB components.

Component from Drupal | Drupal CMS | Adobe AEM component
--------------------- | ---------- | -------------------
Only static content | Extract | Text component
Static content with image | Extract | Text Image component
Video, images and documents | Extract | Has to be stored in DAM
URL | Extract | Part of the Sling resource resolver, which allows adding or removing the absolute or relative path of any URL
Taxonomy | Extract page path | Create taxonomy-level page
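
The mapping above can be expressed as a lookup that also flags content types with no agreed target. This is a minimal sketch; the type keys (`static_text`, `webform`, and so on) and the item structure are hypothetical names, not Drupal identifiers.

```python
# Lookup built from the mapping table; keys are hypothetical type names.
COMPONENT_MAP = {
    "static_text": "Text component",
    "static_text_with_image": "Text Image component",
    "video": "DAM",
    "image": "DAM",
    "document": "DAM",
    "url": "Sling resource resolver",
    "taxonomy": "Taxonomy-level page",
}


def plan_migration(items):
    """Return (plan, unmapped): targets for known types, ids for unknown ones."""
    plan, unmapped = [], []
    for item in items:
        target = COMPONENT_MAP.get(item["type"])
        if target is None:
            unmapped.append(item["id"])
        else:
            plan.append((item["id"], target))
    return plan, unmapped


items = [
    {"id": "node/1", "type": "static_text"},
    {"id": "node/2", "type": "webform"},  # not in the table, so it is flagged
]
plan, unmapped = plan_migration(items)
print(plan, unmapped)
```

Flagged items are candidates for custom components or manual migration.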

Adobe AEM static content: Text OOTB component

§  For any static content the Text component will be used for the content migration from Drupal to Adobe AEM CMS.

RSS feeds

§  Existing RSS feed data will be migrated from Drupal CMS to the AEM JCR repository
§  The AS-IS RSS functionality will be migrated from the source to the destination system

Web forms data

§  All web forms data will be migrated from the source to the destination

Migration Process

§  Page-level migration through a manual process
     I.  Obsolete pages will follow the manual migration
§  Migration of sets of pages through an automated process

Extract Content from Source CMS

§  Extract content from Drupal CMS
§  Test and verify the accuracy of the extraction, and make appropriate changes to the script or program if needed.

Import Content to Target CMS

§  Create scripts/programs to perform migration based on mapping specs
§  Run migration script to import data to the new CMS
§  Test and verify the accuracy of the import and make appropriate changes to the script/program if needed

User Access Control

§  Export the existing user details, create the users in AEM, and assign CRUD privileges to the imported users
§  Create user and group details in AEM.

Create a Test Plan and Test Scripts Design Approach


Migration Activities | SIT Environment | UAT Environment | Prod Environment
-------------------- | --------------- | --------------- | ----------------
Static content validation | YES | YES | YES
Dynamic content validation | YES | YES | YES
End-to-end site navigation validation | YES | YES | YES
Branding validation | YES | YES | YES
Copyright validation | YES | YES | YES
Validation of images, video, URL | YES | YES | YES
Validation of metadata, keywords, SEO tags | YES | YES | YES
User access control validation | YES | YES | YES
Database schema, table and total records validation | YES | YES | YES
Content version and latest content validation | YES | YES | YES
Sitemap validation | YES | YES | YES

       Note: YES means validation is completed.
       Table: 1 Test Plan and Script execution sequence.
§  After content migration, some content mappings need to be verified and validated
§  The migration verification and validation report shows the status of content migrated from source to destination
§  If content fails to migrate in any instance, that failure needs to be highlighted in the report
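
A verification report of the kind described above can be built by comparing identifiers on both sides. The sketch below assumes each content item has a stable identifier in the source export and in the import log; the `validation_report` helper and the sample identifiers are hypothetical.

```python
def validation_report(source_ids, migrated_ids, failed_ids=()):
    """Summarize migration status: what arrived, what failed, what is missing."""
    source = set(source_ids)
    migrated = set(migrated_ids)
    failed = set(failed_ids)
    return {
        "total": len(source),
        "migrated": len(source & migrated),
        "failed": sorted(source & failed),  # import was attempted and failed
        "missing": sorted(source - migrated - failed),  # never reached the target
    }


report = validation_report(
    source_ids=["a", "b", "c", "d"],
    migrated_ids=["a", "b"],
    failed_ids=["c"],
)
print(report)
```

Separating failed items from missing ones helps tell an import error apart from a gap in the extraction or mapping.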
Perform Migration from SIT to UAT and Prod

Activities List | SIT Environment | UAT Environment | Production Environment
--------------- | --------------- | --------------- | ----------------------
Content and digital assets migration and validation; delta migration and validation | Author instances: validate the content migration | Author instances: validate the content migration | Author instances: validate the content migration
(same activities) | Publish instances: publish the validated content from author to publish | Publish instances: publish the validated content from author to publish | Publish instances: publish the validated content from author to publish

      Table: 2 Migration from SIT to UAT and UAT to PROD

§  Make sure only the latest content is migrated from the staging instance to the production instance
§  If content is versioned prior to production migration, those content details need to be noted

Content Migration Methods or Approaches

Automated Approach for  Migration


Task ID | Task Name | Description | Remarks
------- | --------- | ----------- | -------
1 | Content type based metadata | Define a list of metadata for each content type |
2 | Finalized XML structure | XML structure needs to be finalized with client SMEs |
3 | Design and develop migration script | Design and develop the migration script based on the finalized XML structure from Task 2 |
4 | Content-populated XMLs for the websites to be migrated | Client to provide populated XMLs for all target websites, which are used for migration |
6 | Dry run of migration script in test environment | Dry run of the migration script for a single product |

         Table: 3 Approach for Automated migration
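
As a rough illustration of tasks 3 to 6, the sketch below parses a populated XML file and performs a dry run that only reports problems. The XML layout (`pages`, `page`, `title`, `body`) is a made-up placeholder, since the real structure is finalized with the client in task 2; `parse_pages` and `dry_run` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

# Placeholder XML; the real structure is whatever task 2 finalizes.
SAMPLE = """<pages>
  <page path="/en/products/widget" template="product">
    <title>Widget</title>
    <body>Static text for the widget page.</body>
  </page>
  <page path="/en/products/gadget" template="product">
    <title></title>
    <body>Text without a title.</body>
  </page>
</pages>"""


def parse_pages(xml_text):
    """Read page records from the populated XML."""
    root = ET.fromstring(xml_text)
    return [
        {
            "path": page.get("path"),
            "template": page.get("template"),
            "title": page.findtext("title", default="") or "",
            "body": page.findtext("body", default="") or "",
        }
        for page in root.findall("page")
    ]


def dry_run(pages):
    """Return problems found, without writing anything to the target CMS."""
    return [f"{p['path']}: missing title" for p in pages if not p["title"].strip()]


pages = parse_pages(SAMPLE)
print(dry_run(pages))
```

A dry run like this catches incomplete XML early, before any content is written to the test environment.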

Manual Migration Approach


§  Identify the Static content
§  Identify the dynamic content
§  Identify the regular dynamic content
§  Identify the Static component
§  Identify the Dynamic component
§  This approach will be “cut and paste”.

Initial Content in Production approach


§  After the entire content migration to the production instances, the migration has to be tested
§  Content changes need to be frozen in the old CMS
§  Prevent content changes in the old CMS by removing access rights for the content authors

§  No content freeze is needed in the new CMS in the production environment while the delta content is being imported, as described in the approach below
o    After the content freeze, the To-Be template will be used for any content updates in the AS-IS website. The newly added information needs to be captured in a separate spreadsheet, which helps migrate it into the To-Be website.
o    During delta content migration, the migration initially targets the production author instance, so there is no need for a content freeze in production. The delta migration will not impact the live website, i.e. the publish instances.
















Fig: 8 Content freeze in Production

Validate the Production Migration

§  Follow the validation and verification steps used during the test migration
§  Production migration validation ensures that all verification and validation approaches work in the production environment.

Delta Migration & Validation approach


§  The To-Be template will be used for creating content in the Drupal CMS after the content freeze in production.
§  A final export from the old CMS will pick up all content that has been created or modified since the main migration and import it to the new CMS in production.
§  Once the final content migration is done, verify and validate the content migrated to the production instances.
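
Selecting the delta amounts to filtering content by its last-modified time against the time of the main migration. The sketch below assumes each exported record carries a `changed` Unix timestamp; the dates, node ids, and the `select_delta` helper are hypothetical.

```python
from datetime import datetime, timezone


def select_delta(nodes, main_migration_at):
    """Return nodes created or modified after the main migration ran."""
    cutoff = main_migration_at.timestamp()
    return [node for node in nodes if node["changed"] > cutoff]


# Hypothetical main migration time and exported records.
main_run = datetime(2024, 1, 10, tzinfo=timezone.utc)
nodes = [
    {"nid": 1, "changed": datetime(2024, 1, 5, tzinfo=timezone.utc).timestamp()},
    {"nid": 2, "changed": datetime(2024, 1, 12, tzinfo=timezone.utc).timestamp()},
]
delta = select_delta(nodes, main_run)
print([node["nid"] for node in delta])
```

Recording the exact time of the main export is what makes this filter reliable, so it is worth logging that value as part of the migration run.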





















Fig: 9 Delta content migration and validation

Perform Migration Validation and Verification in the Test Environment Approach


§  Validate and verify that the content has been migrated to the destination location, i.e. AEM
§  Different types of errors can show up during this process:
§  Content issues: content can be missing, placed in the wrong location, or have incorrect attributes.
§  Mapping specifications: the mapping specification may be incorrect or incomplete.
§  CMS configuration: the migration may fail because the new CMS has not been configured correctly.

Fig10: Migration Validation Steps.

Post Website Migration Checklist


§  Make sure website pages render properly
§  Check spelling and grammar on the landing, home, and detail pages
§  Use the built-in external link checker to verify that all external links work
§  Make sure all URL redirections work
§  Make sure all forms are in reset mode
§  Make sure a robots.txt file is in place to control which parts of the instance the Google search engine crawls
§  Make sure all social media buttons work and redirect to the correct website page
§  Make sure custom error pages for 404, 500, 502, and 503 are built
§  Make sure content and digital assets are backed up hourly (incremental backup)
§  Make sure a full repository backup is taken once a month
§  Make sure internal system monitoring covers disk, CPU, memory, I/O, network, and bandwidth
§  Make sure no temporary URL redirects exist in your website
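
The robots.txt item in the checklist can be checked offline with Python's standard `urllib.robotparser`. This is a minimal sketch; the robots.txt content and the paths tested are examples chosen for illustration, not a recommended production configuration.

```python
from urllib.robotparser import RobotFileParser


def blocked_paths(robots_txt, paths, agent="Googlebot"):
    """Return the paths that the given crawler is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


# Example rules only; the real file depends on the site's requirements.
ROBOTS = """User-agent: *
Disallow: /bin/
Disallow: /crx/
"""

result = blocked_paths(ROBOTS, ["/bin/querybuilder.json", "/content/site/en.html"])
print(result)
```

Running the same check against the list of pages that must stay indexed confirms that the rules do not block public content by accident.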





















Fig11: Post website Migration checklist.

The WebSphere Application Server logging and tracing


This article helps developers and administrators enable or disable different types of logging and tracing in IBM WebSphere Application Server (WAS).

1.1 Logging and Tracing:

a.    Specify where log data will be stored.
b.    Choose a format for log content.
c.    Specify a log detail level for components and groups of components.
d.    Select an application server to enable or disable a system log for that server, as shown in the screenshot below:


Steps to select the number of items displayed, based on authorization group level: Preferences > Maximum rows, then Select authorization group level > All Roles or Administrator > click Apply > click server1
The Reset button resets the selected values for Preferences and authorization groups.

1.2 JVM Logs

Select Logging and tracing > server1 > JVM Logs


1.    Use this page to view and modify the settings for the Java virtual machine (JVM) System.out and System.err logs for a managed process.
2.    The JVM logs are created by redirecting the System.out and System.err streams of the JVM to independent log files.
3.    The System.out log is used to monitor the health of the running application server.
4.    The System.err log contains exception stack trace information that is used to perform problem analysis.
5.    One set of JVM logs exists for each application server and all of its applications.
6.    JVM logs are also created for the deployment manager and each node manager.
7.    Changes on the Configuration panel apply when the server is restarted.
8.    Changes on the Runtime panel apply immediately.
Select the file name, file formatting, log file rotation (file size and time), historical log files, and application print statement settings, as shown in the screenshots above and below.
Finally, click OK to save the changes to the master configuration.

RUNTIME PANEL:

1.2.1 SystemOut.log file Sample output:

C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\SystemOut.log
1.    WebSphere Platform 8.5.5.3 [BASE 8.5.5.3 cf031430.01] running with process name SINCBP89225LNode01Cell\SINCBP89225LNode01\server1 and process id 7760

2.    Host Operating System is Windows 7, version 6.1

3.    Java version = 1.6.0, Java Runtime Version = pwa6460_26sr8ifx-20140630_01 (SR8), Java Compiler = j9jit26, Java VM name = IBM J9 VM
4.    was.install.root = C:\Program Files (x86)\IBM\WebSphere\AppServer

5.    user.install.root = C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01

6.    Java Home = C:\Program Files (x86)\IBM\WebSphere\AppServer\java\jre



1.2.2 SystemErr.log file:


C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\SystemErr.log

1.3    Process Logs:


2.    The process logs are created by redirecting the standard out and standard error streams of a process to independent log files.
3.    Native code writes to the process logs.
4.    These logs can also contain information that relates to problems in native code or diagnostic information written by the JVM.
5.    One set of process logs is created for each application server and all of its applications.
6.    Process logs are also created for the deployment manager and each node manager.
7.    Changes on the Configuration panel apply when the server is restarted.
8.    Changes on the Runtime panel apply immediately.
General Properties:
Logging and tracing > server1>Process Logs
Change the log file name as required and click OK to save the changes to the master configuration
Runtime Panel for IBM Process Logs

1.3.1 Stdout.Log file path:

C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\native_stdout.log
Sample Output:
[9/14/14 18:39:01:283 SGT] 00000001 ManagerAdmin  I   TRAS0018I: The trace state has changed. The new trace state is *=info.
[9/14/14 18:39:01:442 SGT] 00000001 ManagerAdmin  A   TRAS0007I: Logging to the service log is disabled
[9/14/14 18:39:01:494 SGT] 00000001 ProviderTrack I com.ibm.ffdc.osgi.ProviderTracker AddingService FFDC1007I: FFDC Provider Installed: com.ibm.ffdc.util.provider.FfdcOnDirProvider@57be793c
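
Log lines in the format shown in the sample output above can be split into fields for filtering or reporting. The regular expressions below are inferred from the sample lines only and are an assumption, not an official description of the log format; `parse_line` is a hypothetical helper.

```python
import re

# Pattern inferred from the sample lines: [timestamp] thread component level message
LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread>[0-9a-fA-F]{8})\s+"
    r"(?P<component>\S+)\s+"
    r"(?P<level>[A-Z])\s+"
    r"(?P<message>.*)$"
)
# Message ids such as TRAS0018I: four letters, four digits, one severity letter.
MSG_ID = re.compile(r"\b[A-Z]{4}\d{4}[A-Z]\b")


def parse_line(line):
    """Return a dict of fields, or None when the line is not a log entry."""
    match = LINE.match(line.strip())
    if not match:
        return None
    entry = match.groupdict()
    found = MSG_ID.search(entry["message"])
    entry["message_id"] = found.group(0) if found else None
    return entry


sample = (
    "[9/14/14 18:39:01:283 SGT] 00000001 ManagerAdmin  I   "
    "TRAS0018I: The trace state has changed. The new trace state is *=info."
)
print(parse_line(sample))
```

Lines that do not start with a bracketed timestamp, such as stack trace continuations, return `None` and can be attached to the previous entry by the caller.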
 
 

1.3.2 Stderr.Log file path:

C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\native_stderr.log
Sample Output:
************ Start Display Current Environment ************
Log file started at: [9/14/14 18:39:01:529 SGT]
************* End Display Current Environment *************
JVMDUMP034I User requested Java dump using 'C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\javacore.20140914.190932.8484.0001.txt' through com.ibm.jvm.Dump.JavaDump
JVMDUMP010I Java dump written to C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\javacore.20140914.190932.8484.0001.txt
JVMDUMP034I User requested Heap dump using 'C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\heapdump.20140914.191033.8484.0002.phd' through com.ibm.jvm.Dump.HeapDump
JVMDUMP010I Heap dump written to C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\heapdump.20140914.191033.8484.0002.phd
************ Start Display Current Environment ************
Log file started at: [9/15/14 0:54:42:425 SGT]
************* End Display Current Environment *************
************ Start Display Current Environment ************
Log file started at: [9/15/14 1:49:43:516 SGT]
************* End Display Current Environment *************
************ Start Display Current Environment ************
Log file started at: [9/15/14 23:38:13:167 SGT]
************* End Display Current Environment *************
************ Start Display Current Environment ************
Log file started at: [9/16/14 0:59:34:676 SGT]
************* End Display Current Environment *************
************ Start Display Current Environment ************
Log file started at: [9/17/14 18:52:24:414 SGT]
************* End Display Current Environment *************
************ Start Display Current Environment ************
Log file started at: [9/19/14 20:18:34:798 SGT]
************* End Display Current Environment *************

 

1.4 IBM Service Logs:

1.    The IBM service log is also known as the activity log.
2.    The IBM service log contains both the application server messages that are written to the System.out stream and special messages that contain extended service information that you can use to analyze problems.
3.    One service log exists for all Java virtual machines (JVMs) on a node, including all application servers and their node agent, if present.
4.    A separate activity log is created for a deployment manager in its own logs directory.
5.    The IBM Service log is maintained in a binary format.
6.    Use the Log Analyzer or Showlog tool to view the IBM service log.
Logging and tracing > server1 > IBM Service Logs > click OK > click Save to save the changes to the master configuration

1.4.1 Log file Path:



Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\activity.log

1.4.2 Change Log Level Details:

1.       Use log levels to control which events are processed by Java logging.
2.       Click Components to specify a log detail level for individual components, or click Groups to specify a log detail level for a predefined group of components.
3.        Click a component or group name to select a log detail level. Log detail levels are cumulative; a level near the top of the list includes all the subsequent levels.

Select the component name > Message and Trace Levels > Message Levels and Trace Levels, as shown in the screenshot below
Similarly for groups of components:

Correlation: Enable log and trace correlation so that entries serviced by more than one thread, process, or server are identified as belonging to the same unit of work.
Finally, click Save to save the changes to the master configuration

2. NCSA access and HTTP error logging


Logging and tracing > server1 > NCSA access and HTTP error logging
Use this page to configure HTTP error logs and National Center for Supercomputing Applications (NCSA) access logs.
File Path: C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\http_access.log

Enable the HTTP error log as shown in the screenshot below

2.1 Log File Path:

 C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\http_error.log
Finally, click Save to save the changes to the master configuration


2.2 Diagnostic trace service

Logging and tracing > server1 > Diagnostic trace service
1.    Use this page to view and modify the properties of the diagnostic trace service.
2.     Diagnostic trace provides detailed information about how the application server components run within this managed process.
3.    Changes on the Configuration panel apply when the server is restarted. Changes on the Runtime panel apply immediately.

Finally, click Save to save the changes to the master configuration
Runtime Panel:
C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\logs\server1\trace.log

2.3 Trace Log output:



2.4 Configuration Problem:


It shows the problems that exist in the present configuration. You can select Maximum, High, Medium, Low, or None, and also enable cross validation.



2.5 Java dumps and cores


This is used to generate heap dumps, Java cores or system dumps for a running process. The files resulting from these operations are placed on the local file system as shown in the screenshot:
Java heap dump file path: C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\heapdump.20140923.183711.7760.0001
Select the server > click Heap dump


Post Heap Dump generation:

The same steps apply to the Java core (thread dump) and the system dump.
Java core file path: C:\Program Files (x86)\IBM\WebSphere\AppServer\profiles\AppSrv01\javacore.20140914.190932.8484.0001.txt

Types of run mode available in AEM 6.2

Statement: Types of run mode available in AEM 6.2

Solution:
       - Go to the AEM server installation path and locate the crx-quickstart/conf/sling.properties file
       - Open the sling.properties file; at line 37 you can see the following entry:

sling.run.mode.install.options=author,publish|crx3|crx3tar,crx3mongo,crx3rdb,crx3h2,crx3segment|samplecontent,nosamplecontent

Choose one of the run modes, based on need, to start the instance:

Run mode-1: Author, crx3 & sample content
       - sling.run.mode.install.options=author|crx3|samplecontent
Solution:
       - Start your instance from the command prompt.
For example, on Windows:
D:\<path of AEM jar file>>java -jar aem-author-p4502.jar -r crx3,samplecontent

Run mode-2: Author, crx3 & no sample content

        - sling.run.mode.install.options=author|crx3|nosamplecontent
Solution:
       - Start your instance from the command prompt.
For example, on Windows:
D:\<path of AEM jar file>>java -jar aem-author-p4502.jar -r crx3,nosamplecontent

Run mode-3: Author, crx3tar & sample content
        - sling.run.mode.install.options=author|crx3tar|samplecontent

Solution:
        - Start your instance from the command prompt.
For example, on Windows:
D:\<path of AEM jar file>>java -jar aem-author-p4502.jar -r crx3tar,samplecontent

Run mode-4: Author, crx3tar & no sample content
        - sling.run.mode.install.options=author|crx3tar|nosamplecontent

Similarly, follow the same steps for the other data storage options (e.g. crx3tar, crx3h2, crx3rdb, crx3mongo, etc.)
Legend:
RDB = relational database
H2 = Java SQL database
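
The `sling.run.mode.install.options` value can be read as groups separated by `|`, with the alternatives inside each group separated by commas. The sketch below parses the value quoted from sling.properties and checks that a requested combination picks at most one mode per group. Treating each group as mutually exclusive is an assumption based on how the options are used in this article; `parse_install_options` and `conflicts` are hypothetical helpers.

```python
def parse_install_options(value):
    """Split the property value into groups of alternative run modes."""
    return [group.split(",") for group in value.split("|")]


def conflicts(requested, groups):
    """Return a message for every group from which more than one mode was picked."""
    problems = []
    for group in groups:
        chosen = [mode for mode in requested if mode in group]
        if len(chosen) > 1:
            problems.append(f"choose only one of {group}, got {chosen}")
    return problems


OPTIONS = (
    "author,publish|crx3|crx3tar,crx3mongo,crx3rdb,crx3h2,crx3segment"
    "|samplecontent,nosamplecontent"
)
groups = parse_install_options(OPTIONS)
print(groups)
print(conflicts(["author", "crx3", "samplecontent"], groups))
print(conflicts(["samplecontent", "nosamplecontent"], groups))
```

A check like this is useful in a start script, where a typo in the `-r` argument would otherwise only surface after the instance has started.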