Hue is taking advantage of a new way to specify the frequency of a coordinator in Oozie (OOZIE-1306). Here is how to put it in practice:
The crontab requires Oozie 4. In order to use the previous Frequency drop-down from Oozie 3, the feature can be disabled in hue.ini:
[oozie]
# Use Cron format for defining the frequency of a Coordinator instead of the old frequency number/unit.
enable_cron_scheduling=false
Hue can authenticate with Kerberos in YARN and guarantee that one user cannot access another user’s MapReduce information.
To use the new icons in your app, first import the Hue Filetypes CSS in your .mako template:
<link href="/static/ext/css/hue-filetypes.css" rel="stylesheet">
and then define your icons the same way you would with Font Awesome.
In our case, you need to use a different prefix (hfo instead of fa):
<i class="hfo .."></i>
and then you can specify the icon you want. To render a JSON file icon, for instance, use:
<i class="hfo hfo-file-json"></i>
You can also use the modifiers from Font Awesome, so you can create a larger, rotated JSON file icon like this:
<i class="hfo hfo-file-json fa-2x fa-rotate-90"></i>
Hue easily integrates with your corporation’s existing identity management systems and provides authentication mechanisms for SSO providers. By changing a few configuration parameters, your employees can start doing big data analysis in their browser by leveraging an existing security policy.
This blog post details the various features and capabilities available in Hue for LDAP:
The typical authentication scheme for Hue takes the form shown in the following image:
Passwords are saved in the Hue database.
With the Hue LDAP integration, users can use their LDAP credentials to authenticate and inherit their existing groups transparently. There is no need to save or duplicate any employee password in Hue:
There are several other ways to authenticate with Hue: PAM, SPNEGO, OpenID, OAuth, SAML2, etc. This section details how Hue can authenticate against an LDAP directory server.
When authenticating via LDAP, Hue validates login credentials against a directory service if configured with this authentication backend:
[desktop]
  [[auth]]
  backend=desktop.auth.backend.LdapBackend
The LDAP authentication backend will automatically create users that don’t exist in Hue by default. Hue needs to import users in order to properly perform the authentication. The password is never imported when importing users. The following configuration can be used to disable automatic import:
[desktop]
  [[ldap]]
  create_users_on_login=false
The purpose of disabling the automatic import is to restrict login to a predefined list of manually imported users.
The case sensitivity of the authentication process is defined in the “Case sensitivity” section below.
There are two different ways to authenticate with a directory service through Hue:
The search bind mechanism for authenticating will perform an ldapsearch against the directory service and bind using the found distinguished name (DN) and password provided. This is, by default, used when authenticating with LDAP. The configurations that affect this mechanism are outlined in “LDAP search”.
The direct bind mechanism for authenticating will bind to the LDAP server using the username and password provided at login. There are two options that can be used to choose how Hue binds:
nt_domain - Domain component for User Principal Names (UPN) in Active Directory. This Active Directory specific idiom allows Hue to authenticate with Active Directory without having to follow LDAP references to other partitions. This typically maps to the email address of the user or the user’s ID in conjunction with the domain.
ldap_username_pattern - Provides a template for the DN that will ultimately be sent to the directory service when authenticating.
If ‘nt_domain’ is provided, then Hue will use a UPN to bind to the LDAP service:
[desktop]
  [[ldap]]
  nt_domain=example.com
Otherwise, the ‘ldap_username_pattern’ configuration is used (the <username> parameter will be replaced with the username provided at login):
[desktop]
  [[ldap]]
  ldap_username_pattern="uid=<username>,ou=People,DC=hue-search,DC=ent,DC=cloudera,DC=com"
Typical attributes to search for include:
To enable direct bind authentication, the ‘search_bind_authentication’ configuration must be set to false:
[desktop]
  [[ldap]]
  search_bind_authentication=false
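As an illustration of the two direct bind options, here is a minimal Python sketch of how the bind name could be derived. It is not Hue's actual code: the function name `direct_bind_name` is hypothetical, and the `username@domain` form is the usual UPN layout assumed for ‘nt_domain’.

```python
def direct_bind_name(username, nt_domain=None, ldap_username_pattern=None):
    """Return the name a direct bind would present to the directory (illustrative only)."""
    if nt_domain:
        # Active Directory style: bind with a User Principal Name (user@domain).
        return "%s@%s" % (username, nt_domain)
    if ldap_username_pattern:
        # Otherwise substitute the login name into the DN template.
        return ldap_username_pattern.replace("<username>", username)
    raise ValueError("set nt_domain or ldap_username_pattern for direct bind")


print(direct_bind_name("hue", nt_domain="example.com"))
print(direct_bind_name(
    "hue", ldap_username_pattern="uid=<username>,ou=People,dc=gethue,dc=com"))
```

The sketch gives ‘nt_domain’ priority, mirroring the order described above: the pattern is only consulted when no domain is set.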
If an LDAP user needs to be part of a certain group and have a particular set of permissions, then this user can be imported via the Useradmin interface:
As you can see, there are two options available when importing:
Distinguished name
Create home directory
If ‘Create home directory’ is checked, when the user is imported their home directory in HDFS will automatically be created, if it doesn’t already exist.
If ‘Distinguished name’ is checked, then the username provided must be a full distinguished name (e.g., uid=hue,ou=People,dc=gethue,dc=com). Otherwise, the username provided should be a fragment of a Relative Distinguished Name (rDN) (e.g., the username “hue” maps to the rDN “uid=hue”). Hue will perform an LDAP search using the same methods and configurations as defined in the “LDAP search” section: essentially, it takes the provided username and creates a search filter using the ‘user_filter’ and ‘user_name_attr’ configurations.
The case sensitivity of the search and import processes is defined in the “Case sensitivity” section.
Groups are importable via the Useradmin interface. Then, users can be added to this group, which would provide a set of permissions (e.g. accessing the Impala application). This function works almost the exact same way as user importing, but has a couple of extra features.
As the above image portrays, not only can groups be discovered via DN and rDN search, but users that are members of the group and members of the group’s subordinate groups can be imported as well. Posix groups and members are automatically imported if the group found has the object class “posixGroup”.
Users and groups can be synchronized with the directory service via the Useradmin interface or via a command line utility. The images from the previous sections use the word “Sync” to indicate that when the name of a user or group that already exists in Hue is added, it will in fact be synchronized instead. In the case of importing users for a particular group, new users will be imported and existing users will be synchronized. Note: Users that have been deleted from the directory service will not be deleted from Hue. Those users can be manually deactivated from Hue via the Useradmin interface.
Currently, only the first name, last name, and email address are synchronized. Hue looks for the LDAP attributes ‘givenName’, ‘sn’, and ‘mail’ when synchronizing. Also, the ‘user_name_attr’ config is used to appropriately choose the username in Hue. For instance, if ‘user_name_attr’ is set to “uid”, then the “uid” returned by the directory service will be used as the username of the user in Hue.
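The attribute mapping described above can be sketched in a few lines of Python. This is a hedged illustration, not Hue's implementation: the function name and the output keys (`first_name`, `last_name`, `email`) are assumptions, and the entry is modeled as a plain dict of attribute name to list of values, since LDAP attributes are multi-valued.

```python
def ldap_entry_to_user_fields(entry, user_name_attr="uid"):
    """Pick the synchronized fields out of an LDAP entry (illustrative only)."""

    def first(attr):
        # Use the first value of a multi-valued attribute, or "" when absent.
        values = entry.get(attr) or [""]
        return values[0]

    return {
        "username": first(user_name_attr),
        "first_name": first("givenName"),
        "last_name": first("sn"),
        "email": first("mail"),
    }


entry = {"uid": ["hue"], "givenName": ["Hue"], "sn": ["User"], "mail": ["hue@example.com"]}
print(ldap_entry_to_user_fields(entry))
```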
The “Sync LDAP users/groups” button in the Useradmin interface will automatically synchronize all users and groups.
Here’s a quick example of how to use the command line interface to synchronize users and groups:
<hue root>/build/env/bin/hue sync_ldap_users_and_groups
There are two configurations for restricting the search process:
user_filter - General LDAP filter to restrict the search.
user_name_attr - Which attribute will be considered the username to search against.
Here is an example configuration:
[desktop]
  [[ldap]]
    [[[users]]]
    user_filter="objectClass=*"
    user_name_attr=uid
With the above configuration, the LDAP search filter will take on the form:
(&(objectClass=*)(uid=<user entered username>))
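To make the filter construction concrete, here is a minimal Python sketch that combines ‘user_filter’ and ‘user_name_attr’ with the entered username. The function names are hypothetical, and the escaping of special characters (following the LDAP filter string rules of RFC 4515) is an added precaution in this sketch, not something the text above states Hue does.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value (RFC 4515)."""
    replacements = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}
    return "".join(replacements.get(ch, ch) for ch in value)


def build_user_search_filter(username, user_filter="objectClass=*", user_name_attr="uid"):
    """Combine the general filter with the username match using a logical AND."""
    if not user_filter.startswith("("):
        # Wrap a bare filter such as objectClass=* in parentheses.
        user_filter = "(" + user_filter + ")"
    return "(&%s(%s=%s))" % (user_filter, user_name_attr, escape_filter_value(username))


print(build_user_search_filter("hue"))
```

Escaping matters because a login name containing `*` or `)` would otherwise change the meaning of the filter.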
Hue can be configured to ignore the case of usernames as well as force usernames to lower case via the ‘ignore_username_case’ and ‘force_username_lowercase’ configurations. We recommend using these two configurations together. This is useful when integrating with a directory service containing usernames in capital letters and Unix usernames in lowercase letters (which is a Hadoop requirement). Here is an example of configuring them:
[desktop]
  [[ldap]]
  ignore_username_case=true
  force_username_lowercase=true
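The effect of the two options can be pictured with a small Python sketch. It only illustrates the intended behavior under the assumption that one option affects comparison and the other affects the stored name; the helper names are hypothetical and this is not Hue's code.

```python
def normalize_username(username, force_username_lowercase=False):
    """Return the username as it would be stored, lowercased when forced."""
    return username.lower() if force_username_lowercase else username


def usernames_match(entered, stored, ignore_username_case=False):
    """Compare two usernames, optionally ignoring case."""
    if ignore_username_case:
        return entered.lower() == stored.lower()
    return entered == stored


# A directory entry "JDoe" ends up as the Unix-friendly "jdoe",
# and a user typing "JDOE" still matches it.
stored = normalize_username("JDoe", force_username_lowercase=True)
print(stored, usernames_match("JDOE", stored, ignore_username_case=True))
```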
Secure communication with LDAP is provided via the SSL/TLS and StartTLS protocols. It allows Hue to validate the directory service it’s going to converse with. Practically speaking, if a Certificate Authority Certificate file is provided, Hue will communicate via LDAPS:
[desktop]
  [[ldap]]
  ldap_cert=/etc/hue/ca.crt
The StartTLS protocol can be used as well (step up to SSL/TLS):
[desktop]
  [[ldap]]
  use_start_tls=true
The Hue team is working hard on improving security. Upcoming LDAP features include: Import nested LDAP groups and multidomain support for Active Directory. We hope this brief overview of LDAP in Hue will help you make your system more secure, more compliant with current security standards, and open up big data analysis to many more users!
First, back up the database. By default, this is the SQLite file:
cp /var/lib/hue/desktop.db ~/
Then if using CM, export this variable in order to point to the correct database:
HUE_CONF_DIR=/var/run/cloudera-scm-agent/process/<id>-hue-HUE_SERVER
export HUE_CONF_DIR
Where <id> is the most recent ID in that process directory for hue-HUE_SERVER.
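Finding that directory can be scripted. The following Python sketch assumes that `<id>` is a numeric prefix and that the largest number is the most recent one; both are assumptions about the Cloudera Manager process directory layout, and the function name is hypothetical.

```python
import os
import re


def latest_hue_conf_dir(process_root="/var/run/cloudera-scm-agent/process"):
    """Return the <id>-hue-HUE_SERVER directory with the highest numeric id, or None."""
    pattern = re.compile(r"^(\d+)-hue-HUE_SERVER$")
    candidates = []
    for name in os.listdir(process_root):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(process_root, name)):
            # Keep the id as an integer so that 101 sorts after 12.
            candidates.append((int(match.group(1)), name))
    if not candidates:
        return None
    return os.path.join(process_root, max(candidates)[1])
```

The returned path is the value to export as HUE_CONF_DIR. Comparing the ids as integers avoids the string-ordering trap where "12" would sort after "101".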
Then open the database shell. From the Hue root (/usr/lib/hue by default):
root@hue:hue# build/env/bin/hue dbshell
And you can start typing SQL queries:
sqlite> .tables
auth_group                         oozie_dataset
auth_group_permissions             oozie_decision
auth_permission                    oozie_decisionend
auth_user                          oozie_distcp
auth_user_groups                   oozie_email
auth_user_user_permissions         oozie_end
beeswax_metainstall                oozie_fork
beeswax_queryhistory               oozie_fs
beeswax_savedquery                 oozie_generic
beeswax_session                    oozie_history
desktop_document                   oozie_hive
desktop_document_tags              oozie_java
desktop_documentpermission         oozie_job
desktop_documentpermission_groups  oozie_join
desktop_documentpermission_users   oozie_kill
desktop_documenttag                oozie_link
desktop_settings                   oozie_mapreduce
desktop_userpreferences            oozie_node
django_admin_log                   oozie_pig
django_content_type                oozie_shell
django_openid_auth_association     oozie_sqoop
django_openid_auth_nonce           oozie_ssh
django_openid_auth_useropenid      oozie_start
django_session                     oozie_streaming
django_site                        oozie_subworkflow
jobsub_checkforsetup               oozie_workflow
jobsub_jobdesign                   pig_document
jobsub_jobhistory                  pig_pigscript
jobsub_oozieaction                 search_collection
jobsub_ooziedesign                 search_facet
jobsub_ooziejavaaction             search_result
jobsub_ooziemapreduceaction        search_sorting
jobsub_ooziestreamingaction        south_migrationhistory
oozie_bundle                       useradmin_grouppermission
oozie_bundledcoordinator           useradmin_huepermission
oozie_coordinator                  useradmin_ldapgroup
oozie_datainput                    useradmin_userprofile
oozie_dataoutput
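The same table listing can be obtained without the interactive shell. Here is a minimal sketch using Python's standard `sqlite3` module; the function name is hypothetical, and it is safest to point it at the backup copy rather than the live file, since `sqlite3.connect` creates an empty database when the path does not exist.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file, sorted by name."""
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

For example, `list_tables("desktop.db")` run next to the backup made earlier should return names such as `auth_user` and `desktop_document`.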
Or migrate the database manually:
build/env/bin/hue syncdb
build/env/bin/hue migrate
If you want to switch to another database (we recommend MySQL), this guide details the migration process.
The database settings in Hue are located in the hue.ini.
In Hue versions before 3, Hue can sometimes become slow and “stuck”. To fix this problem, we recommend switching Hue to the CherryPy server instead of Spawning.
In the hue.ini or the Hue Safety Valve in CM, enter:
[desktop]
use_cherrypy_server = true
First, it is a bit simpler to configure Hue with MR2 than with MR1, as Hue does not need the JobTracker plugin because YARN provides a REST API. YARN is also going to provide an equivalent of JobTracker HA with YARN-149.
Here is how to configure the clusters in hue.ini. If you are using a pseudo-distributed cluster, it will work by default. If not, you just need to replace every localhost with the hostnames of the ResourceManager and History Server:
[hadoop]
  ...
  # Configuration for YARN (MR2)
  # ------------------------------------------------------------------------
  [[yarn_clusters]]
    [[[default]]]
    # Enter the host on which you are running the ResourceManager
    resourcemanager_host=localhost
    # The port where the ResourceManager IPC listens on
    resourcemanager_port=8032
    # Whether to submit jobs to this cluster
    submit_to=True
    # URL of the ResourceManager API
    resourcemanager_api_url=http://localhost:8088
    # URL of the ProxyServer API
    proxy_api_url=http://localhost:8088
    # URL of the HistoryServer API
    history_server_api_url=http://localhost:19888
  # Configuration for MapReduce (MR1)
  # ------------------------------------------------------------------------
  [[mapred_clusters]]
    [[[default]]]
    # Whether to submit jobs to this cluster
    submit_to=False
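For a real cluster, the localhost substitution can be sketched in Python. This is only an illustration: the function names are hypothetical, the ports are the ones from the sample configuration, and placing the ProxyServer on the ResourceManager host is an assumption taken from the sample, where both URLs share port 8088.

```python
def yarn_cluster_settings(resourcemanager_host="localhost", history_server_host="localhost"):
    """Return the [[[default]]] YARN settings with localhost replaced by real hostnames."""
    return {
        "resourcemanager_host": resourcemanager_host,
        "resourcemanager_port": 8032,
        "submit_to": True,
        "resourcemanager_api_url": "http://%s:8088" % resourcemanager_host,
        # Assumption: the proxy runs alongside the ResourceManager.
        "proxy_api_url": "http://%s:8088" % resourcemanager_host,
        "history_server_api_url": "http://%s:19888" % history_server_host,
    }


def render_ini_lines(settings):
    """Format the settings as key=value lines ready to paste into hue.ini."""
    return ["%s=%s" % (key, value) for key, value in settings.items()]


for line in render_ini_lines(yarn_cluster_settings("rm.example.com", "hs.example.com")):
    print(line)
```

The hostnames `rm.example.com` and `hs.example.com` are placeholders for your own ResourceManager and History Server.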
And that’s it! You can now look at jobs in Job Browser, get logs and submit jobs to Yarn!