How to Improve AEM Workflow Throughput

Statement:

How to improve AEM workflow throughput after a hardware size upgrade.


Step 1 tuning: Go to http://localhost:4502/system/console/configMgr and search for Apache Sling Job Thread Pool.
Its PID is org.apache.sling.event.impl.EventingThreadPool. The default value is 35.
Raise this value to a higher number, say 65.

Step 2 tuning: Navigate to /system/console/slingevent and find the Max Parallel attribute of the relevant job queue.
  1. Edit the [Job Queue Configuration] for [Granite Workflow Queue],
  2. the job queue [Granite Transient Workflow Queue],
  3. the [Granite Workflow External Process Job Queue], or
  4. any other job queue of type ‘Topic Round Robin’ that is relevant to your workload.
  5. Increase the value for ‘Max Parallel’. The default value is usually 0.5 (half the number of vCPUs of your instance).
To find the number of processors of your instance, open http://localhost:4502/system/console/vmstat
Use the processor count as the baseline: the default corresponds to half the number of processors, so with 4 processors the Max Parallel value would be 2.
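
The arithmetic above can be sketched in a few lines of JavaScript. This is only an illustration: the helper name effectiveMaxParallel is hypothetical, and rounding down with a minimum of one job is an assumption of this sketch, not confirmed AEM behavior.

```javascript
// Hypothetical helper: turns a fractional Max Parallel setting into a job count.
// Assumption: the fraction is multiplied by the processor count and rounded down,
// with a floor of one parallel job.
function effectiveMaxParallel(processors, ratio = 0.5) {
  return Math.max(1, Math.floor(processors * ratio));
}

// With 4 processors and the default ratio of 0.5, two jobs run in parallel.
console.log(effectiveMaxParallel(4)); // 2
```

Raising the ratio after a hardware upgrade, for example effectiveMaxParallel(8, 0.75), shows how many parallel jobs a given setting would allow before you change the queue configuration.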


AEM 6.5 New features

Top features of AEM 6.5


  1. Connected DAM (Digital Asset Management) ...
  2. Adobe Asset Link Extension with AEM Assets. ...
  3. Integrate AEM Assets with Adobe Stock. ...
  4. Brand Portal Capabilities. ...
  5. Adobe Experience Manager SPA (Single Page Application) Editor. ...
  6. Smart Crop. ...
  7. Visual Search. ...
  8. Headless Content Delivery.
  9. Automated Forms Conversion
  10. Reusing Workflow Across Multiple Adaptive Forms
  11. Magento and Experience Cloud integration
  12. Content and Experience fragments enhancements
1. Connected DAM 
  • In a large-scale enterprise, there may be two AEM instances running in parallel.
  • A common scenario is that one instance is used as the AEM Sites author, operated by the AEM Sites team, while the second instance, referred to as the Assets author instance, is used by the creative team to store assets.
  • With Connected DAM, site authors can search, drag and drop, save, and publish assets directly even though the assets reside on a different instance. It gives site authors access to assets on a remote AEM Assets instance.
  • A Digital Asset Management (DAM) platform gives you a central hub for organizing, storing, and retrieving rich media.
  • AEM Sites offers capabilities to create web pages, and AEM Assets is the DAM system that supplies the required assets for websites.
  • AEM now provides the capability to use assets from a connected DAM that is running in a completely different AEM instance.
                              Seamless connection
Site Author (web, mobile) <-------- Connected DAM --------> DAM Author (image, video, etc.)
       Sites Team         <-------------------------------> Creative Team


It offers unified, easy-to-use access to all owned assets, no matter where they are stored.
This is ideal for a growing number of businesses that need a central, standalone DAM with AEM Assets.

Limitations:

  • This feature is only supported on Adobe Managed Services.
  • The AEM Sites instance can connect to only one AEM Assets repository.
  • A license is required for the remote Assets repository.
  • There is no API support to customize the integration.

Too many workflows in Inbox crash AEM due to pulse.data.json calls

Issue



The AEM author instance is slow and crashes because of a pile-up of failed workflows and the pulse.data.json calls made by the workflow inbox badge. The workflow inbox badge is the bell icon in the upper-right of the Touch UI.
In the access.log, it is observed that many calls to pulse.data.json are occurring from the same users. 

In addition, the AEM server shows high CPU utilization, and thread dumps captured from AEM show threads like the one below.
Logs:
10.43.34.55 - user 06/Jan/2017:18:17:11 +0900 "GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json?_=1483664547926 HTTP/1.1" 500 1234
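
To check whether the same users account for most of these calls, the access.log can be tallied per user. The following is a minimal sketch, assuming every line has the same field layout as the sample above (IP, "-", user, timestamp, ...); the function name countPulseCallsByUser is hypothetical.

```javascript
// Count pulse.data.json requests per user.
// Assumption: fields are whitespace-separated and the user is the third field,
// as in the sample log line above.
function countPulseCallsByUser(lines) {
  const counts = new Map();
  for (const line of lines) {
    if (!line.includes("pulse.data.json")) continue; // ignore unrelated requests
    const fields = line.trim().split(/\s+/);
    if (fields.length < 3) continue; // skip malformed lines
    const user = fields[2];
    counts.set(user, (counts.get(user) || 0) + 1);
  }
  return counts;
}

const sample = [
  '10.43.34.55 - user 06/Jan/2017:18:17:11 +0900 "GET /mnt/overlay/granite/ui/content/shell/header/actions/pulse.data.json?_=1483664547926 HTTP/1.1" 500 1234',
];
console.log(countPulseCallsByUser(sample)); // Map with one entry: "user" => 1
```

In practice the lines would come from reading the access.log file; users with unusually high counts are the ones whose inboxes are worth inspecting first.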

As per thread dumps:

org.apache.jackrabbit.oak.security.authorization.composite.CompositeAuthorizationConfiguration.getPermissionProvider(CompositeAuthorizationConfiguration.java:134)
    at org.apache.jackrabbit.oak.core.MutableRoot$1.createValue(MutableRoot.java:126)
    at org.apache.jackrabbit.oak.core.MutableRoot$1.createValue(MutableRoot.java:123)
    at org.apache.jackrabbit.oak.core.LazyValue.get(LazyValue.java:53)
    - locked <0x0000000726732148> (a org.apache.jackrabbit.oak.core.MutableRoot$1)


Environment

AEM 6.2 SP1

Resolution

Install the latest Cumulative Fix Pack to fix the bug.


I. Apply the Workaround

Since it is not possible to install a fix pack on a production AEM environment in a short timeframe, as a temporary workaround, do the following:




  1. Go to CRXDE Lite at http://aem-host:port/crx/de/index.jsp and log in as admin.
  2. Create this folder structure out of sling:Folder nodes: /apps/granite/ui/components/shell/clientlibs/shell/js
  3. Click "Save All".
  4. Browse to the overlay for /libs/granite/ui/components/shell/clientlibs/shell/js/badge.js and modify the code as shown below:


    Before:
    setInterval(function() {
        updateBadge(el, src, true);
    }, 2000);


    After (set to update every 5 minutes):
    setInterval(function() {
        updateBadge(el, src, true);
    }, 300000);
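
To see how much this change reduces the load, the number of badge requests per open browser session can be estimated from the interval alone. This is a rough sketch: the helper pollsPerHour is hypothetical, and it assumes exactly one request per timer tick.

```javascript
// Hypothetical helper: requests per hour issued by one setInterval timer.
// Assumption: each tick results in exactly one pulse.data.json request.
function pollsPerHour(intervalMs) {
  return (60 * 60 * 1000) / intervalMs;
}

const before = pollsPerHour(2000);   // 2-second interval
const after = pollsPerHour(300000);  // 5-minute interval
console.log(before, after, before / after); // 1800 12 150
```

Under these assumptions, each session drops from 1800 requests per hour to 12, a 150-fold reduction.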

II. Purge Old Running Workflows and Tasks

In addition to fixing the interval of the workflow notification badge, address the underlying cause of the problem: too many tasks pending in the user's inbox. To do so, delete the workflow inbox items and tasks that are no longer needed:




  1. Go to the Workflow Maintenance JMX object:
    http://host:port/system/console/jmx/com.adobe.granite.workflow%3Atype%3DMaintenance

  2. If you don't need the actively running workflows, purge them by invoking purgeActive with dryRun = false.

  3. Go to http://host:port/crx/explorer/index.jsp and log in as admin.

  4. Open Content Explorer.

  5. Browse to /etc/taskmanagement/tasks.

  6. Delete tasks by right-clicking the folder node and selecting Delete Recursively.

  7. Disable Preliminary Scan and run the deletion.

  8. In addition, more tasks are stored under projects in /content/projects. Use /projects.html to remove old projects that are no longer needed.

  9. Use CRXDE Lite to browse the /content/projects subnodes and delete any tasks that are no longer required. For example: /content/projects/geometrixx/outdoors/jcr:content/dashboard/gadgets/tasks
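
The JMX URL in step 1 is the MBean object name com.adobe.granite.workflow:type=Maintenance with ":" and "=" percent-encoded. The sketch below shows how such a URL can be built; the helper jmxConsoleUrl and the base URL http://localhost:4502 are assumptions for illustration, so substitute your own host and port.

```javascript
// Hypothetical helper: builds the Felix console JMX URL for an MBean object name.
function jmxConsoleUrl(baseUrl, objectName) {
  // Strip trailing slashes, then percent-encode ":" and "=" in the object name.
  return baseUrl.replace(/\/+$/, "") + "/system/console/jmx/" + encodeURIComponent(objectName);
}

const maintenanceUrl = jmxConsoleUrl(
  "http://localhost:4502", // assumed host and port
  "com.adobe.granite.workflow:type=Maintenance"
);
console.log(maintenanceUrl);
// http://localhost:4502/system/console/jmx/com.adobe.granite.workflow%3Atype%3DMaintenance
```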