AEM 6.1_How to enable Lucene index configuration


Recommendation

Open /system/console/configMgr/org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService and
  • Enable CopyOnRead
  • Enable CopyOnWrite
  • Enable Prefetch Index Files

See http://jackrabbit.apache.org/oak/docs/query/lucene.html for more information about the available parameters; a configuration sketch follows below.
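
The same settings can also be applied programmatically rather than through the console. The following is a minimal sketch, assuming access to the OSGi ConfigurationAdmin service; the property names used (enableCopyOnReadSupport, enableCopyOnWriteSupport, prefetchIndexFiles) are not stated in this document and are assumptions based on the Oak documentation linked above, so verify them against the Oak version in use.

  import java.io.IOException;
  import java.util.Hashtable;

  import org.osgi.service.cm.Configuration;
  import org.osgi.service.cm.ConfigurationAdmin;

  public class LuceneIndexProviderConfig {

      // PID of the service configured through the console URL above
      private static final String PID =
              "org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexProviderService";

      // Property names are assumptions taken from the Oak docs; verify against your version
      public static void enableIndexCopying(ConfigurationAdmin configAdmin) throws IOException {
          Configuration config = configAdmin.getConfiguration(PID, null);
          Hashtable<String, Object> props = new Hashtable<>();
          props.put("enableCopyOnReadSupport", Boolean.TRUE);   // CopyOnRead
          props.put("enableCopyOnWriteSupport", Boolean.TRUE);  // CopyOnWrite
          props.put("prefetchIndexFiles", Boolean.TRUE);        // Prefetch Index Files
          config.update(props);  // note: replaces any existing properties for this PID
      }
  }
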
  • Some paths do not need to be indexed. In CRXDE Lite, go to /oak:index/lucene and add a multi-value string property (String[]) named "excludedPaths" with the values /var, /etc/workflow/instances, and /etc/replication, as in the sketch below.
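
For scripted deployments, the same property can be set through the JCR API instead of CRXDE Lite. A minimal sketch, assuming an administrative JCR Session is available:

  import javax.jcr.Node;
  import javax.jcr.RepositoryException;
  import javax.jcr.Session;

  public class LuceneIndexExcludedPaths {

      // Mirrors the CRXDE Lite step above: adds excludedPaths to /oak:index/lucene
      public static void setExcludedPaths(Session session) throws RepositoryException {
          Node luceneIndex = session.getNode("/oak:index/lucene");
          luceneIndex.setProperty("excludedPaths",
                  new String[] { "/var", "/etc/workflow/instances", "/etc/replication" });
          session.save();
      }
  }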

Lucene Index Configuration related findings

Finding ID | Title                        | Total Risk | Effort to Fix
JVM1       | Enable CopyOnRead            | Medium     | Low
JVM2       | Enable CopyOnWrite           | Medium     | Low
JVM3       | Dupdate.limit                | Medium     | Low
JVM4       | Enable Prefetch Index Files  | Medium     | Low


How to protect your AEM instances from Google searches: Robots.txt



Statement: How to protect AEM instances from Google searches.

Recommendation

Here is an example search that lists servers that have not removed Geometrixx; enter the following query in a search engine:
 inurl:/content/geometrixx
-          First and foremost, as a best practice, all CQ5 author and publish servers should be placed behind a firewall and should not be publicly accessible.
-          Only your web server (dispatcher) should be in front of the firewall. If your author and publish servers are behind a firewall, there is no way for Google to index them.

Solution:


ROBOTS.txt

If it is absolutely necessary for an author or publish server to be in front of the firewall, add a robots.txt file to the root directory /.
-          This file will prevent most search engines from displaying your server in search results.

Here are the steps for doing this:

-          Navigate to CRXDE Lite at {server}/crx/de/ (make sure you are logged in as admin).
-          Right-click the root node and go to Create … > Create File …

1.       Name the file robots.txt.
2.       Place the following content in the file and save it:

            User-agent: *
            Disallow: /

3.       Grant the anonymous user read access to the file. To do this, navigate to the user admin console at {server}/useradmin (for example, localhost:4502/useradmin).
4.       Open the anonymous user and click on the Permissions tab.
5.       Grant read access to the robots.txt file, then click Save.
-          Verify that the robots.txt file exists and is accessible by logging out first, then navigating to {server}/robots.txt (for example, localhost:4502/robots.txt).
-          If it is there, search engines should no longer index your server.
-          Repeat these steps for all author/publish servers that are publicly accessible. A scripted alternative using the JCR API is sketched below.
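
If you prefer to script the steps above rather than click through CRXDE Lite and the user admin console, the following is a minimal sketch using the JCR API and the AccessControlUtils helper from jackrabbit-jcr-commons. It assumes an administrative Session and that the anonymous principal uses the default name "anonymous".

  import java.io.ByteArrayInputStream;
  import java.nio.charset.StandardCharsets;

  import javax.jcr.Node;
  import javax.jcr.RepositoryException;
  import javax.jcr.Session;
  import javax.jcr.security.Privilege;

  import org.apache.jackrabbit.commons.JcrUtils;
  import org.apache.jackrabbit.commons.jackrabbit.authorization.AccessControlUtils;

  public class RobotsTxtSetup {

      private static final String ROBOTS_TXT = "User-agent: *\nDisallow: /\n";

      // Creates /robots.txt and grants the anonymous principal read access to it
      public static void createRobotsTxt(Session session) throws RepositoryException {
          Node root = session.getRootNode();

          // nt:file / nt:resource structure, equivalent to "Create File …" in CRXDE Lite
          JcrUtils.putFile(root, "robots.txt", "text/plain",
                  new ByteArrayInputStream(ROBOTS_TXT.getBytes(StandardCharsets.UTF_8)));

          // Equivalent of granting read access in the user admin console
          AccessControlUtils.allow(root.getNode("robots.txt"), "anonymous", Privilege.JCR_READ);

          session.save();
      }
  }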

Robots.txt related findings

Finding ID | Name                                             | Total Risk | Effort to Fix
RB1        | Enable robots.txt in prod author and publishers  | High       | Medium

AEM 6.1_Adobe Experience Manager crashes during large asset uploads




Statement: AEM crashes during large asset uploads.

Solution:
The default maximum cache size for CQBufferedImageCache is a quarter of the JVM heap size.
To illustrate the problem, say you have a system with a maximum heap (-Xmx) of 5 GB, an Oak BlobCache set to 1 GB, and a Document cache set to 2 GB.
In this case, the buffered image cache would take up to 1.25 GB, leaving only 0.75 GB of memory for unexpected spikes (see the sketch after this paragraph).
Eventually, the JVM fails with OutOfMemoryErrors. To solve the problem, reduce the configured maximum size of the buffered image cache.
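
To make the arithmetic explicit, here is a small illustrative sketch of the memory budget from the example above (the numbers are examples, not recommendations):

  public class CacheBudgetExample {

      public static void main(String[] args) {
          double maxHeapGb = 5.0;                       // -Xmx
          double blobCacheGb = 1.0;                     // Oak BlobCache
          double documentCacheGb = 2.0;                 // Document cache
          double bufferedImageCacheGb = maxHeapGb / 4;  // default CQBufferedImageCache limit: 1.25 GB

          double headroomGb = maxHeapGb - blobCacheGb - documentCacheGb - bufferedImageCacheGb;
          System.out.printf("Headroom for everything else: %.2f GB%n", headroomGb);  // prints 0.75 GB
      }
  }
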
When uploading large numbers of assets to Adobe Experience Manager, tune the buffered cache size by configuring it via the OSGi Web Console:
  1. Go to http://host:port/system/console/configMgr/com.day.cq.dam.core.impl.cache.CQBufferedImageCache
  2. Set the property cq.dam.image.cache.max.memory in bytes; for example, 1073741824 bytes is 1 GB (1024 * 1024 * 1024). A configuration sketch follows below.
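
For automated setups, the same value can be applied programmatically. A minimal sketch, assuming access to the OSGi ConfigurationAdmin service (the PID and property name are the ones given in the steps above):

  import java.io.IOException;
  import java.util.Hashtable;

  import org.osgi.service.cm.Configuration;
  import org.osgi.service.cm.ConfigurationAdmin;

  public class BufferedImageCacheConfig {

      private static final String PID = "com.day.cq.dam.core.impl.cache.CQBufferedImageCache";

      // Caps the buffered image cache at 1 GB (1024 * 1024 * 1024 bytes)
      public static void limitCacheSize(ConfigurationAdmin configAdmin) throws IOException {
          Configuration config = configAdmin.getConfiguration(PID, null);
          Hashtable<String, Object> props = new Hashtable<>();
          props.put("cq.dam.image.cache.max.memory", 1073741824L);
          config.update(props);  // note: replaces any existing properties for this PID
      }
  }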