Configuring Hadoop Group Mappings for LDAP/Active Directory

To enforce LDAP/Active Directory group-level authorization in Hadoop, Hadoop must recognize Active Directory users and groups. Set up Hadoop group mapping using one of the following options:

Configuring Hadoop Group Mapping in core-site.xml

  1. Access Apache Ambari.
  2. From the side toolbar, under Services click HDFS.
  3. Click Configs > Advanced.
  4. Add the following key-value pairs to Custom core-site, replacing each {…} placeholder with the value for your Active Directory server:
    "hadoop.security.group.mapping":"org.apache.hadoop.security.CompositeGroupsMapping",
    "hadoop.security.group.mapping.providers.combined":"true",
    "hadoop.security.group.mapping.providers":"jniUnix,adServer",
    "hadoop.security.group.mapping.provider.jniUnix":"org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback",
    "hadoop.security.group.mapping.provider.adServer":"org.apache.hadoop.security.LdapGroupsMapping",
    "hadoop.security.group.mapping.provider.adServer.ldap.url":"ldap://{AD_FQDN}:389",
    "hadoop.security.group.mapping.provider.adServer.ldap.base":"{AD_SEARCH_BASE}",
    "hadoop.security.group.mapping.provider.adServer.ldap.bind.user":"{AD_BIND_USER_DN}",
    "hadoop.security.group.mapping.provider.adServer.ldap.bind.password":"{AD_BIND_USER_PWD}",
    "hadoop.security.group.mapping.provider.adServer.ldap.search.filter.group":"(&(objectclass=group)(cn=*))",
    "hadoop.security.group.mapping.provider.adServer.ldap.search.filter.user":"(&(objectclass=user)(sAMAccountName={0}))",
    "hadoop.security.group.mapping.provider.adServer.ldap.search.attr.memberof":"memberOf",
    "hadoop.security.group.mapping.provider.adServer.ldap.search.attr.group.name":"cn",
    "hadoop.security.group.mapping.ldap.ssl":"false",
    "hadoop.security.group.mapping.provider.adServer.ldap.ssl.truststore":"",
    "hadoop.security.group.mapping.provider.adServer.ldap.ssl.truststore.password.file":"",
    

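The key-value pairs in step 4 share a common prefix per provider, so they can be generated instead of typed by hand. The following is a minimal Python sketch, not part of the product: the function name build_ad_group_mapping and its parameters are hypothetical, and the keys and default values are copied from the list above.

```python
import json


def build_ad_group_mapping(ad_fqdn, search_base, bind_user_dn, bind_password,
                           provider="adServer"):
    """Return the core-site key-value pairs from step 4 for one LDAP provider.

    Hypothetical helper: the provider name only has to be used consistently
    in every key and in the providers list.
    """
    mapping = "hadoop.security.group.mapping"
    prefix = f"{mapping}.provider.{provider}"
    return {
        mapping: "org.apache.hadoop.security.CompositeGroupsMapping",
        f"{mapping}.providers.combined": "true",
        f"{mapping}.providers": f"jniUnix,{provider}",
        f"{mapping}.provider.jniUnix":
            "org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback",
        prefix: "org.apache.hadoop.security.LdapGroupsMapping",
        f"{prefix}.ldap.url": f"ldap://{ad_fqdn}:389",
        f"{prefix}.ldap.base": search_base,
        f"{prefix}.ldap.bind.user": bind_user_dn,
        f"{prefix}.ldap.bind.password": bind_password,
        f"{prefix}.ldap.search.filter.group": "(&(objectclass=group)(cn=*))",
        # The value is a plain string, so {0} stays literal in the filter.
        f"{prefix}.ldap.search.filter.user":
            "(&(objectclass=user)(sAMAccountName={0}))",
        f"{prefix}.ldap.search.attr.memberof": "memberOf",
        f"{prefix}.ldap.search.attr.group.name": "cn",
        f"{mapping}.ldap.ssl": "false",
        f"{prefix}.ldap.ssl.truststore": "",
        f"{prefix}.ldap.ssl.truststore.password.file": "",
    }


if __name__ == "__main__":
    # Example values only; substitute the details of your own AD server.
    pairs = build_ad_group_mapping(
        ad_fqdn="ad.example.com",
        search_base="dc=example,dc=com",
        bind_user_dn="cn=binduser,cn=Users,dc=example,dc=com",
        bind_password="<AD_BIND_USER_PWD>",
    )
    print(json.dumps(pairs, indent=2))
```

The printed JSON can be pasted into Custom core-site one pair at a time.
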
To set up all the mappings at one time with the Ambari configuration script, complete the following steps on un0:

  1. Set the path to the Ambari configuration script:
    export AMBARI_CONF_SCRIPT=/var/lib/ambari-server/resources/scripts/configs.py
  2. Collect the <ambari_password>, <un0_node_IP>, and <cluster_name> values (the cluster name is shown in the Ambari UI), and then run the following to save the current core-site configuration to /tmp/current_core-site.json:
    python ${AMBARI_CONF_SCRIPT} \
    --user=admin \
    --password='<ambari_password>' \
    --protocol=https --unsafe \
    --host=<un0_node_IP> \
    --port=7183 \
    --cluster=<cluster_name> \
    --config-type=core-site \
    --action=get \
    --file=/tmp/current_core-site.json
  3. Paste the key-value pairs listed under Configuring Hadoop Group Mapping in core-site.xml, with the appropriate values, into the properties object of the config JSON file /tmp/current_core-site.json. Then add the following key-value pair to the password object of the properties_attributes object so that Ambari masks the bind password. The provider name in the key (adServer here) must match the provider name used in the properties.
    "password": {
       "hadoop.security.group.mapping.provider.adServer.ldap.bind.password": "true"
     },
  4. Run the following to update the configuration:
    python ${AMBARI_CONF_SCRIPT} \
    --user=admin \
    --password='<ambari_password>' \
    --protocol=https --unsafe \
    --host=<un0_node_IP> \
    --port=7183 \
    --cluster=<cluster_name> \
    --config-type=core-site \
    --action=set \
    --file=/tmp/current_core-site.json
  5. By default, Hadoop refreshes the user-group mapping cache every 300 seconds. To use a shorter refresh interval, add the following key-value pair to the core-site properties:
    "hadoop.security.groups.cache.secs":"<number of seconds you need>"
  6. In Ambari, click the Services '...' icon, and then click Restart all required.
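Step 3 of this procedure can also be scripted rather than edited by hand. The following is a minimal Python sketch under stated assumptions: the function names merge_group_mapping and update_config_file are hypothetical, and the file is assumed to have the top-level properties and properties_attributes objects that step 3 describes.

```python
import copy
import json


def merge_group_mapping(config, new_props, password_keys):
    """Return a copy of the exported config with new properties merged in.

    config        -- dict loaded from current_core-site.json
    new_props     -- key-value pairs to add to the properties object
    password_keys -- property names that Ambari should mask
    """
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    merged.setdefault("properties", {}).update(new_props)
    attributes = merged.setdefault("properties_attributes", {})
    password_attr = attributes.setdefault("password", {})
    for key in password_keys:
        password_attr[key] = "true"  # same value as shown in step 3
    return merged


def update_config_file(path, new_props, password_keys):
    """Apply merge_group_mapping to a JSON file in place."""
    with open(path) as handle:
        config = json.load(handle)
    merged = merge_group_mapping(config, new_props, password_keys)
    with open(path, "w") as handle:
        json.dump(merged, handle, indent=2)


if __name__ == "__main__":
    # In-memory example; real use would call update_config_file on
    # /tmp/current_core-site.json before running the set action in step 4.
    bind_key = "hadoop.security.group.mapping.provider.adServer.ldap.bind.password"
    example = {"properties": {}, "properties_attributes": {}}
    print(json.dumps(
        merge_group_mapping(example, {bind_key: "<AD_BIND_USER_PWD>"}, [bind_key]),
        indent=2))
```

Merging into a copy keeps any existing core-site properties in the file and makes it easy to compare the result with the original before uploading it.
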

Configuring Hadoop Group Mapping Using SSSD (Recommended)

  1. Enable yum package installation.
    sudo sed -i "s/enabled=0/enabled=1/g" /etc/yum.repos.d/oracle-linux-ol7.repo
  2. Broadcast this change to all nodes.
    sudo dcli -f /etc/yum.repos.d/oracle-linux-ol7.repo -d /etc/yum.repos.d/oracle-linux-ol7.repo
  3. Install the required packages on all nodes.
    sudo dcli -C "yum -y install sssd realmd oddjob oddjob-mkhomedir adcli krb5-workstation samba-common-tools openldap-clients"
  4. Run the realm command with the join parameter to join the node to the Active Directory domain. The -v option prints verbose output that you can use to validate that the join happened.
    sudo realm join -v -U Administrator@<AD_REALM_NAME> <AD_REALM_NAME> --membership-software=adcli
    Note

    The realm join command uses either samba or adcli to perform the join operation. To force one of them, add --membership-software=<samba/adcli> to specify the membership software you want to use.

    To register a different computer name into the AD server, add --computer-name=<computer-name>.

    To perform realm join on all the nodes of your cluster, run:

    cat /etc/hosts | grep -v 'local' | awk '{print $NF}' | while read line; do echo $line; ssh -n $line "sudo echo '<BIND_USER_PASSWORD>' | sudo realm join -v -U Administrator@<AD_REALM_NAME> <AD_REALM_NAME> --membership-software=adcli"; sleep 1; done

  5. Update /etc/sssd/sssd.conf. In the [domain/<AD_REALM_NAME>] section, make the following updates.
    1. Change use_fully_qualified_names to False.

      sudo sed -i 's/use_fully_qualified_names = True/use_fully_qualified_names = False/g' /etc/sssd/sssd.conf

    2. (Optional) Add the following.
      ldap_user_search_base = cn=Users,dc=ad,dc=domain,dc=com
      ldap_group_search_base = cn=Users,dc=ad,dc=domain,dc=com
      Note

      For each segment in the AD domain name, you must have a separate dc entry. For example, if the AD domain name is A.B.C.D.COM, you must include this dc entry list: dc=a,dc=b,dc=c,dc=d,dc=com.
  6. Broadcast the sssd.conf file to all nodes.
    sudo dcli -f /etc/sssd/sssd.conf -d /etc/sssd/sssd.conf
  7. Restart the SSSD service.
    sudo dcli -C "/bin/systemctl restart sssd.service"
  8. Verify that SSSD is getting the user and group information.
    sudo dcli -C "id Administrator"
    Note

    If the response is id: Administrator: No such user, then SSSD isn't working correctly.
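The dc rule in the note under step 5 (one dc entry per segment of the AD domain name) is easy to get wrong for long domain names. The following is a minimal Python sketch of that rule; the function names domain_to_dc and search_base are hypothetical, and cn=Users is taken from the optional sssd.conf example in step 5.

```python
def domain_to_dc(domain):
    """Convert an AD domain name such as A.B.C.D.COM to a dc entry list."""
    labels = [label for label in domain.strip().strip(".").split(".") if label]
    if not labels:
        raise ValueError("domain name must contain at least one segment")
    # One lowercase dc entry per segment, joined with commas.
    return ",".join(f"dc={label.lower()}" for label in labels)


def search_base(container, domain):
    """Prefix the dc entry list with a container such as cn=Users."""
    return f"{container},{domain_to_dc(domain)}"


if __name__ == "__main__":
    # Values for the optional ldap_user_search_base and
    # ldap_group_search_base settings in sssd.conf.
    print("ldap_user_search_base =", search_base("cn=Users", "ad.domain.com"))
    print("ldap_group_search_base =", search_base("cn=Users", "ad.domain.com"))
```
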