WebHCat (Templeton) Manual

Apache Hive : WebHCat

This is the manual for WebHCat, previously known as Templeton. WebHCat is the REST API for HCatalog, a table and storage management layer for Hadoop. 

See the HCatalog Manual for general HCatalog documentation.

Apache Hive : WebHCat Configure

Configuration Files

The configuration for WebHCat (Templeton) merges the normal Hadoop configuration with the WebHCat-specific variables. Because WebHCat is designed to connect services that are not normally connected, the configuration is more complex than might be desirable.

The WebHCat-specific configuration is split into two layers:

  1. webhcat-default.xml – All the configuration variables that WebHCat needs. This file sets the defaults that ship with WebHCat and should only be changed by WebHCat developers. Do not copy this file or change it to maintain local installation settings. Because webhcat-default.xml is present in the WebHCat war file, editing a local copy of it will not change the configuration.
  2. webhcat-site.xml – The (possibly empty) configuration file in which the system administrator can set variables for their Hadoop cluster. Create this file and maintain entries in it for configuration variables that require you to override default values based on your local installation.
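As an illustration, a minimal webhcat-site.xml might override just a couple of defaults. (The property names below are among those defined in webhcat-default.xml; the values shown are placeholders for a hypothetical cluster, not recommendations.)

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Port the WebHCat server listens on (50111 is the shipped default). -->
  <property>
    <name>templeton.port</name>
    <value>50111</value>
  </property>
  <!-- ZooKeeper servers; only needed if the ZooKeeper storage class is used. -->
  <property>
    <name>templeton.zookeeper.hosts</name>
    <value>zk1.example.com:2181,zk2.example.com:2181</value>
  </property>
</configuration>
```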

Apache Hive : WebHCat Installation

WebHCat Installed with Hive

WebHCat and HCatalog are installed with Hive, starting with Hive release 0.11.0.

If you install Hive from the binary tarball, the WebHCat server command webhcat_server.sh is in the hcatalog/sbin directory.

Hive installation is documented here.

WebHCat Installation Procedure

Note: WebHCat was originally called Templeton. For backward compatibility the name still appears in URLs, log file names, variable names, etc.

  1. Ensure that the required related installations are in place, and place required files into the Hadoop distributed cache.
  2. Download and unpack the HCatalog distribution.
  3. Set the TEMPLETON_HOME environment variable to the base of the HCatalog REST server installation. This will usually be the same as HCATALOG_HOME. It is used to find the WebHCat (Templeton) configuration.
  4. Set JAVA_HOME, HADOOP_PREFIX, and HIVE_HOME environment variables.
  5. Review the configuration and update or create webhcat-site.xml as required. Ensure that site-specific component installation locations are accurate, especially the Hadoop configuration path. Configuration variables that use a filesystem path try to have reasonable defaults, but it’s always safe to specify a full and complete path.
  6. Verify that HCatalog is installed and that the hcat executable is in the PATH.
  7. Build HCatalog using the command ant jar from the top level HCatalog directory.
  8. Start the REST server with the command “hcatalog/sbin/webhcat_server.sh start” for Hive 0.11.0 releases and later, or “sbin/webhcat_server.sh start” for installations prior to HCatalog merging with Hive.
  9. Check that your local install works. Assuming that the server is running on port 50111, the following command would give output similar to that shown.
% curl -i http://localhost:50111/templeton/v1/status
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(7.6.0.v20120127)

{"status":"ok","version":"v1"}
%

Server Commands

  • Start the server:
    • HCatalog 0.5.0 and earlier (prior to Hive release 0.11.0): sbin/webhcat_server.sh start
    • Hive release 0.11.0 and later: hcatalog/sbin/webhcat_server.sh start
  • Stop the server:
    • HCatalog 0.5.0 and earlier (prior to Hive release 0.11.0): sbin/webhcat_server.sh stop
    • Hive release 0.11.0 and later: hcatalog/sbin/webhcat_server.sh stop
  • End-to-end build, run, test: ant e2e

Requirements

  • Ant, version 1.8 or higher
  • Hadoop, version 1.0.3 or higher
  • ZooKeeper is required if you are using the ZooKeeper storage class. (Be sure to review and update the ZooKeeper-related WebHCat configuration.)
  • HCatalog, version 0.5.0 or higher. The hcat executable must be both in the PATH and properly configured in the WebHCat configuration.
  • Permissions must be given to the user running the server. (See below.)
  • If running a secure cluster, Kerberos keys and principals must be created. (See below.)
  • Hadoop Distributed Cache. To use Hive, Pig, or Hadoop Streaming resources, see instructions below for placing the required files in the Hadoop Distributed Cache.

Hadoop Distributed Cache

The server requires that some files be accessible on the Hadoop distributed cache. For example, to avoid installing Pig and Hive on every node in the cluster, the server picks up a version of Pig or Hive from the Hadoop distributed cache whenever those resources are invoked. After placing the following components into HDFS, update the site configuration as required for each.

Apache Hive : WebHCat Reference

Reference: WebHCat Resources

This overview page lists all of the WebHCat resources. (DDL resources are listed here and on another overview page. For information about HCatalog DDL commands, see HCatalog DDL. For information about Hive DDL commands, see Hive Data Definition Language.)

 

| Category | Resource (Type) | Description |
|---|---|---|
| General | :version (GET) | Return a list of supported response types. |
| | status (GET) | Return the WebHCat server status. |
| | version (GET) | Return a list of supported versions and the current version. |
| | version/hive (GET) | Return the Hive version being run. (Added in Hive 0.13.0.) |
| | version/hadoop (GET) | Return the Hadoop version being run. (Added in Hive 0.13.0.) |
| DDL | ddl (POST) | Perform an HCatalog DDL command. |
| | ddl/database (GET) | List HCatalog databases. |
| | ddl/database/:db (GET) | Describe an HCatalog database. |
| | ddl/database/:db (PUT) | Create an HCatalog database. |
| | ddl/database/:db (DELETE) | Delete (drop) an HCatalog database. |
| | ddl/database/:db/table (GET) | List the tables in an HCatalog database. |
| | ddl/database/:db/table/:table (GET) | Describe an HCatalog table. |
| | ddl/database/:db/table/:table (PUT) | Create a new HCatalog table. |
| | ddl/database/:db/table/:table (POST) | Rename an HCatalog table. |
| | ddl/database/:db/table/:table (DELETE) | Delete (drop) an HCatalog table. |
| | ddl/database/:db/table/:existingtable/like/:newtable (PUT) | Create a new HCatalog table like an existing one. |
| | ddl/database/:db/table/:table/partition (GET) | List all partitions in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (GET) | Describe a single partition in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (PUT) | Create a partition in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (DELETE) | Delete (drop) a partition in an HCatalog table. |
| | ddl/database/:db/table/:table/column (GET) | List the columns in an HCatalog table. |
| | ddl/database/:db/table/:table/column/:column (GET) | Describe a single column in an HCatalog table. |
| | ddl/database/:db/table/:table/column/:column (PUT) | Create a column in an HCatalog table. |
| | ddl/database/:db/table/:table/property (GET) | List table properties. |
| | ddl/database/:db/table/:table/property/:property (GET) | Return the value of a single table property. |
| | ddl/database/:db/table/:table/property/:property (PUT) | Set a table property. |
| MapReduce | mapreduce/streaming (POST) | Create and queue Hadoop streaming MapReduce jobs. |
| | mapreduce/jar (POST) | Create and queue standard Hadoop MapReduce jobs. |
| Pig | pig (POST) | Create and queue Pig jobs. |
| Hive | hive (POST) | Run Hive queries and commands. |
| Queue (deprecated in Hive 0.12.0, removed in Hive 0.14.0) | queue (GET) | Return a list of all job IDs. (Removed in Hive 0.14.0.) |
| | queue/:jobid (GET) | Return the status of a job given its ID. (Removed in Hive 0.14.0.) |
| | queue/:jobid (DELETE) | Kill a job given its ID. (Removed in Hive 0.14.0.) |
| Jobs (Hive 0.12.0 and later) | jobs (GET) | Return a list of all job IDs. |
| | jobs/:jobid (GET) | Return the status of a job given its ID. |
| | jobs/:jobid (DELETE) | Kill a job given its ID. |

Apache Hive : WebHCat Reference AllDDL

WebHCat Reference: DDL Resources

This is an overview page for the WebHCat DDL resources. The full list of WebHCat resources is on this overview page.

| Object | Resource (Type) | Description |
|---|---|---|
| DDL Command | ddl (POST) | Perform an HCatalog DDL command. |
| Database | ddl/database (GET) | List HCatalog databases. |
| | ddl/database/:db (GET) | Describe an HCatalog database. |
| | ddl/database/:db (PUT) | Create an HCatalog database. |
| | ddl/database/:db (DELETE) | Delete (drop) an HCatalog database. |
| Table | ddl/database/:db/table (GET) | List the tables in an HCatalog database. |
| | ddl/database/:db/table/:table (GET) | Describe an HCatalog table. |
| | ddl/database/:db/table/:table (PUT) | Create a new HCatalog table. |
| | ddl/database/:db/table/:table (POST) | Rename an HCatalog table. |
| | ddl/database/:db/table/:table (DELETE) | Delete (drop) an HCatalog table. |
| | ddl/database/:db/table/:existingtable/like/:newtable (PUT) | Create a new HCatalog table like an existing one. |
| Partition | ddl/database/:db/table/:table/partition (GET) | List all partitions in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (GET) | Describe a single partition in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (PUT) | Create a partition in an HCatalog table. |
| | ddl/database/:db/table/:table/partition/:partition (DELETE) | Delete (drop) a partition in an HCatalog table. |
| Column | ddl/database/:db/table/:table/column (GET) | List the columns in an HCatalog table. |
| | ddl/database/:db/table/:table/column/:column (GET) | Describe a single column in an HCatalog table. |
| | ddl/database/:db/table/:table/column/:column (PUT) | Create a column in an HCatalog table. |
| Property | ddl/database/:db/table/:table/property (GET) | List table properties. |
| | ddl/database/:db/table/:table/property/:property (GET) | Return the value of a single table property. |
| | ddl/database/:db/table/:table/property/:property (PUT) | Set a table property. |

Apache Hive : WebHCat Reference DDL

Description

Performs an HCatalog DDL command. The command is executed immediately upon request. Responses are limited to 1 MB. For requests that may return longer results, consider using the Hive resource as an alternative.

URL

http://www.myserver.com/templeton/v1/ddl

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| exec | The HCatalog DDL string to execute | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table. The format is "rwxrw-r-x". | Optional | None |

The standard parameters are also supported.
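As a sketch in the style of the other examples in this manual, a DDL command can be submitted as form data (the server address, user name, and command are placeholders; this requires a running WebHCat server):

```shell
% curl -s -d user.name=ctdean -d 'exec=show tables;' \
       'http://localhost:50111/templeton/v1/ddl'
```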

Apache Hive : WebHCat Reference DeleteDB

Description

Delete a database.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| ifExists | Hive returns an error if the database specified does not exist, unless ifExists is set to true. | Optional | false |
| option | Set to either "restrict" or "cascade". Restrict will remove the schema only if all the tables are empty. Cascade removes everything, including data and definitions. | Optional | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

The standard parameters are also supported.
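For illustration, a drop-database request might look like the following (the server, user name, and database name are placeholders; a live WebHCat server is assumed):

```shell
% curl -s -X DELETE \
       'http://localhost:50111/templeton/v1/ddl/database/newdb?user.name=ctdean&ifExists=true'
```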

Apache Hive : WebHCat Reference DeleteJob

Description

Kill a job given its job ID. Substitute “:jobid” with the job ID received when the job was created.

Version: Deprecated in 0.12.0

DELETE queue/:jobid is deprecated starting in Hive release 0.12.0. Users are encouraged to use DELETE jobs/:jobid instead. (See HIVE-4443.)
DELETE queue/:jobid is equivalent to DELETE jobs/:jobid – check [DELETE jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejobid/) for documentation.

Version: Obsolete in 0.14.0

DELETE queue/:jobid will be removed in Hive release 0.14.0. (See HIVE-6432.)
Use [DELETE jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejobid/) instead.

Apache Hive : WebHCat Reference DeleteJobID

Description

Kill a job given its job ID. Substitute “:jobid” with the job ID received when the job was created.

Version: Hive 0.12.0 and later

DELETE jobs/:jobid is introduced in Hive release 0.12.0. It is equivalent to [DELETE queue/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejob/) in prior releases.
DELETE queue/:jobid is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).

URL

http://www.myserver.com/templeton/v1/jobs/:jobid

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :jobid | The job ID to delete. This is the ID received when the job was created. | Required | None |

The standard parameters are also supported.
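A hypothetical kill request, with a placeholder job ID and server address (assumes a running WebHCat server):

```shell
% curl -s -X DELETE \
       'http://localhost:50111/templeton/v1/jobs/job_201312091733_0001?user.name=ctdean'
```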

Apache Hive : WebHCat Reference DeletePartition

Description

Delete (drop) a partition in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, a col_name='value' list. Be careful to properly encode the quote for HTTP, for example, country=%27algeria%27. | Required | None |
| ifExists | Hive returns an error if the partition specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

The standard parameters are also supported.
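Because the partition specification contains single quotes, it must be percent-encoded before it is placed in the URL. A small sketch (the database, table, and partition values are illustrative):

```shell
# Percent-encode the quotes in a partition spec such as country='algeria'.
partition="country='algeria'"
encoded=$(printf '%s' "$partition" | sed "s/'/%27/g")
echo "$encoded"    # country=%27algeria%27

# The encoded spec then goes into the DELETE URL (requires a live server):
# curl -s -X DELETE \
#   "http://localhost:50111/templeton/v1/ddl/database/default/table/mytest/partition/${encoded}?user.name=ctdean"
```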

Apache Hive : WebHCat Reference DeleteTable

Description

Delete (drop) an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| ifExists | Hive 0.7.0 and later return an error if the table specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

The standard parameters are also supported.
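A drop-table request might look like this sketch (server, user name, and table name are placeholders; a running WebHCat server is assumed):

```shell
% curl -s -X DELETE \
       'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table?user.name=ctdean&ifExists=true'
```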

Apache Hive : WebHCat Reference GetColumn

Description

Describe a single column in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| column | A JSON object containing the column name, type, and comment (if any) |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/column/price?user.name=ctdean'

JSON Output

{
 "database": "default",
 "table": "test_table",
 "column": {
   "name": "price",
   "comment": "The unit price",
   "type": "float"
 }
}

Apache Hive : WebHCat Reference GetColumns

Description

List the columns in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| columns | A list of column names and types |
| database | The database name |
| table | The table name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/column?user.name=ctdean'

JSON Output

{
 "columns": [
   {
     "name": "id",
     "type": "bigint"
   },
   {
     "name": "user",
     "comment": "The user name",
     "type": "string"
   },
   {
     "name": "my_p",
     "type": "string"
   },
   {
     "name": "my_q",
     "type": "string"
   }
 ],
 "database": "default",
 "table": "my_table"
}

Apache Hive : WebHCat Reference GetDB

Description

Describe a database. (Note: This resource accepts a "format=extended" parameter; however, the output structure does not change when it is used.)

URL

http://www.myserver.com/templeton/v1/ddl/database/:db

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| location | The database location |
| params | The database parameters |
| comment | The database comment |
| database | The database name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/newdb?user.name=ctdean'

JSON Output

{
 "location":"hdfs://localhost:9000/warehouse/newdb.db",
 "params":"{a=b}",
 "comment":"Hello there",
 "database":"newdb"
}

JSON Output (error)

{
  "error": "No such database: newdb",
  "errorCode": 404
}

Apache Hive : WebHCat Reference GetDBs

Description

List the databases in HCatalog.

URL

http://www.myserver.com/templeton/v1/ddl/database

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| like | List only databases whose names match the specified pattern. | Optional | "*" (list all) |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| databases | A list of database names |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database?user.name=ctdean&like=n*'

JSON Output

{
 "databases": [
   "newdb",
   "newdb2"
 ]
}

Apache Hive : WebHCat Reference GetPartition

Description

Describe a single partition in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, a col_name='value' list. Be careful to properly encode the quote for HTTP, for example, country=%27algeria%27. | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| partition | The partition name |
| partitioned | True if the table is partitioned |
| location | Location of the table |
| outputFormat | Output format |
| columns | List of column names, types, and comments |
| owner | The owner's user name |
| partitionColumns | List of the partition columns |
| inputFormat | Input format |

Example

Curl Command

% curl -s \
   'http://localhost:50111/templeton/v1/ddl/database/default/table/mytest/partition/country=%27US%27?user.name=ctdean'

JSON Output

{
  "partitioned": true,
  "location": "hdfs://ip-10-77-6-151.ec2.internal:8020/apps/hive/warehouse/mytest/loc1",
  "outputFormat": "org.apache.hadoop.hive.ql.io.RCFileOutputFormat",
  "columns": [
    {
      "name": "i",
      "type": "int"
    },
    {
      "name": "j",
      "type": "bigint"
    },
    {
      "name": "ip",
      "comment": "IP Address of the User",
      "type": "string"
    }
  ],
  "owner": "rachel",
  "partitionColumns": [
    {
      "name": "country",
      "type": "string"
    }
  ],
  "inputFormat": "org.apache.hadoop.hive.ql.io.RCFileInputFormat",
  "database": "default",
  "table": "mytest",
  "partition": "country='US'"
}

Apache Hive : WebHCat Reference GetPartitions

Description

List all the partitions in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| partitions | A list of partition names and values |
| database | The database name |
| table | The table name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/partition?user.name=ctdean'

JSON Output

{
  "partitions": [
    {
      "values": [
        {
          "columnName": "dt",
          "columnValue": "20120101"
        },
        {
          "columnName": "country",
          "columnValue": "US"
        }
      ],
      "name": "dt='20120101',country='US'"
    }
  ],
  "database": "default",
  "table": "my_table"
}

Apache Hive : WebHCat Reference GetProperties

Description

List all the properties of an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| properties | A list of the table's properties as name: value pairs |
| database | The database name |
| table | The table name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/property?user.name=ctdean'

JSON Output

{
 "properties": {
   "fruit": "apple",
   "last_modified_by": "ctdean",
   "hcat.osd": "org.apache.hcatalog.rcfile.RCFileOutputDriver",
   "color": "blue",
   "last_modified_time": "1331620706",
   "hcat.isd": "org.apache.hcatalog.rcfile.RCFileInputDriver",
   "transient_lastDdlTime": "1331620706",
   "comment": "Best table made today",
   "country": "Albania"
 },
 "table": "test_table",
 "database": "default"
}

Apache Hive : WebHCat Reference GetProperty

Description

Return the value of a single table property.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| property | The requested property's name: value pair |
| database | The database name |
| table | The table name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/property/fruit?user.name=ctdean'

JSON Output

{
 "property": {
   "fruit": "apple"
 },
 "table": "test_table",
 "database": "default"
}

JSON Output (error)

{
  "error": "Table test_table does not exist",
  "errorCode": 404,
  "database": "default",
  "table": "test_table"
}

Apache Hive : WebHCat Reference GetTable

Description

Describe an HCatalog table. Normally returns a simple list of columns (using “desc table”), but the extended format will show more information (using “show table extended like”).

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table?format=extended

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| format | Set "format=extended" to see additional information (using "show table extended like"). | Optional | Not extended |

The standard parameters are also supported.
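For illustration, the plain and extended forms could be requested as follows (server, user name, and table name are placeholders; a live WebHCat server is assumed):

```shell
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table?user.name=ctdean'
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table?user.name=ctdean&format=extended'
```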

Apache Hive : WebHCat Reference GetTables

Description

List the tables in an HCatalog database.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| like | List only tables whose names match the specified pattern | Optional | "*" (list all tables) |

The standard parameters are also supported.

Results

| Name | Description |
|---|---|
| tables | A list of table names |
| database | The database name |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table?user.name=ctdean&like=m*'

JSON Output

{
 "tables": [
   "my_table",
   "my_table_2",
   "my_table_3"
 ],
 "database": "default"
}

JSON Output (error)

{
  "errorDetail": "
    org.apache.hadoop.hive.ql.metadata.HiveException: ERROR: The database defaultsd does not exist.
        at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3122)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:224)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
        at org.apache.hcatalog.cli.HCatDriver.run(HCatDriver.java:42)
        at org.apache.hcatalog.cli.HCatCli.processCmd(HCatCli.java:247)
        at org.apache.hcatalog.cli.HCatCli.processLine(HCatCli.java:203)
        at org.apache.hcatalog.cli.HCatCli.main(HCatCli.java:162)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
    ",
  "error": "FAILED: Error in metadata: ERROR: The database defaultsd does not exist.",
  "errorCode": 500,
  "database": "defaultsd"
}

Apache Hive : WebHCat Reference Hive

Description

Runs a Hive query or set of commands.

Version: Hive 0.13.0 and later

As of Hive 0.13.0, GET version/hive displays the Hive version used for the query or commands.

URL

http://www.myserver.com/templeton/v1/hive

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| execute | String containing an entire, short Hive program to run. | One of either "execute" or "file" is required. | None |
| file | HDFS file name of a Hive program to run. | One of either "execute" or "file" is required. | None |
| define | Set a Hive configuration variable using the syntax define=NAME=VALUE. See the note about curl and the "=" character. | Optional | None |
| arg | Set a program argument. This parameter was introduced in Hive 0.12.0. (See HIVE-4444.) | Optional in Hive 0.12.0+ | None |
| files | Comma-separated files to be copied to the MapReduce cluster. This parameter was introduced in Hive 0.12.0. (See HIVE-4444.) | Optional in Hive 0.12.0+ | None |
| statusdir | A directory where WebHCat will write the status of the Hive job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is "true", collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The subdirectory layout is: logs/$job_id (a directory per $job_id), logs/$job_id/job.xml.html, logs/$job_id/$attempt_id (a directory per $attempt_id), logs/$job_id/$attempt_id/stderr, logs/$job_id/$attempt_id/stdout, logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId; this tag will be replaced in the callback URL with this job's job ID. | Optional | None |

The standard parameters are also supported.
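As a sketch, a short Hive query might be submitted like this (the query, status directory, user name, and server are placeholders; note the "+" encoding for spaces in form data, and a running WebHCat server is assumed):

```shell
% curl -s -d user.name=ctdean \
       -d execute='select+*+from+pokes;' \
       -d statusdir='pokes.output' \
       'http://localhost:50111/templeton/v1/hive'
```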

Apache Hive : WebHCat Reference Job

Description

Check the status of a job and get related job information given its job ID. Substitute “:jobid” with the job ID received when the job was created.

Version: Hive 0.12.0 and later

GET jobs/:jobid is introduced in Hive release 0.12.0. It is equivalent to [GET queue/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobinfo/) in prior releases.
GET queue/:jobid is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).

Apache Hive : WebHCat Reference JobIDs

Description

Return a list of all job IDs.

Version: Deprecated in 0.12.0

GET queue is deprecated starting in Hive release 0.12.0. (See HIVE-4443.) Users are encouraged to use [GET jobs](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobs/) instead.

Version: Obsolete in 0.14.0

GET queue will be removed in Hive release 0.14.0. (See HIVE-6432.)
Use [GET jobs](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobs/) instead.

URL

http://www.myserver.com/templeton/v1/queue

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| showall | If showall is set to "true", the request will return all jobs the user has permission to view, not only the jobs belonging to the user. This parameter is not available in releases prior to Hive 0.12.0. (See HIVE-4442.) | Optional in Hive 0.12.0+ | false |

The standard parameters are also accepted.

Apache Hive : WebHCat Reference JobInfo

Description

Check the status of a job and get related job information given its job ID. Substitute “:jobid” with the job ID received when the job was created.

Version: Deprecated in 0.12.0

GET queue/:jobid is deprecated starting in Hive release 0.12.0. Users are encouraged to use GET jobs/:jobid instead. (See HIVE-4443.)
GET queue/:jobid is equivalent to GET jobs/:jobid – check [GET jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-job/) for documentation.

Version: Obsolete in 0.14.0

GET queue/:jobid will be removed in Hive release 0.14.0. (See HIVE-6432.)
Use [GET jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-job/) instead.

Apache Hive : WebHCat Reference Jobs

Description

Return a list of all job IDs.

Version: Hive 0.12.0 and later

GET jobs is introduced in Hive release 0.12.0. It is equivalent to [GET queue](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobids/) in prior releases.
GET queue is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).

URL

http://www.myserver.com/templeton/v1/jobs

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| fields | If fields is set to "*", the request returns full details of each job. If fields is missing, only the job IDs are returned. Currently "*" is the only allowed value; other values throw an exception. | Optional | None |
| showall | If showall is set to "true", the request returns all jobs the user has permission to view, not only the jobs belonging to the user. | Optional | false |
| jobid | If jobid is present, only the records whose job ID is lexicographically greater than jobid are returned. For example, if jobid = "job_201312091733_0001", only jobs whose ID is greater than "job_201312091733_0001" are returned. The number of records returned depends on the value of numrecords. This parameter is not available in releases prior to Hive 0.13.0. (See HIVE-5519.) | Optional in Hive 0.13.0+ | None |
| numrecords | If both jobid and numrecords are present, the top numrecords records appearing after jobid are returned, after sorting the job ID list lexicographically. If jobid is missing and numrecords is present, the top numrecords are returned after lexicographically sorting the job ID list. If jobid is present and numrecords is missing, all records whose job ID is greater than jobid are returned. This parameter is not available in releases prior to Hive 0.13.0. (See HIVE-5519.) | Optional in Hive 0.13.0+ | All |

The standard parameters are also accepted.
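Combining these parameters, a listing request might look like the following sketch (server, user name, and parameter values are placeholders; a running WebHCat server is assumed):

```shell
% curl -s 'http://localhost:50111/templeton/v1/jobs?user.name=ctdean&showall=true&numrecords=10'
```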

Apache Hive : WebHCat Reference MapReduceJar

Description

Creates and queues a standard Hadoop MapReduce job.

Version: Hive 0.13.0 and later

As of Hive 0.13.0, GET version/hadoop displays the Hadoop version used for the MapReduce job.

URL

http://www.myserver.com/templeton/v1/mapreduce/jar

Parameters

| Name | Description | Required? | Default |
|---|---|---|---|
| jar | Name of the jar file for MapReduce to use. | Required | None |
| class | Name of the class for MapReduce to use. | Required | None |
| libjars | Comma-separated jar files to include in the classpath. | Optional | None |
| files | Comma-separated files to be copied to the MapReduce cluster. | Optional | None |
| arg | Set a program argument. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE. | Optional | None |
| statusdir | A directory where WebHCat will write the status of the MapReduce job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is "true", collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The subdirectory layout is: logs/$job_id (a directory per $job_id), logs/$job_id/job.xml.html, logs/$job_id/$attempt_id (a directory per $attempt_id), logs/$job_id/$attempt_id/stderr, logs/$job_id/$attempt_id/stdout, logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId; this tag will be replaced in the callback URL with this job's job ID. | Optional | None |
| usehcatalog | Specify that the submitted job uses HCatalog and therefore needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. (See HIVE-5133.) This parameter was introduced in Hive 0.13.0. Also, if webhcat-site.xml defines the parameters templeton.hive.archive, templeton.hive.home, and templeton.hcat.home, then WebHCat will ship the Hive tar to the target node where the job runs. (See HIVE-5547.) This means that Hive doesn't need to be installed on every node in the Hadoop cluster. This is independent of security but improves manageability. These webhcat-site.xml parameters are documented in webhcat-default.xml. | Optional in Hive 0.13.0+ | false |

The standard parameters are also supported.
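
A sketch of a submission, following the document's curl style; the jar name, class name, paths, and user are made-up examples, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="ekoifman"

# Each -d pair is one form-encoded parameter from the table above;
# arg may repeat, once per program argument.
SUBMIT="curl -s \
  -d jar=wordcount.jar \
  -d class=org.myorg.WordCount \
  -d arg=wordcount/input \
  -d arg=wordcount/output \
  -d statusdir=wordcount/status \
  '${WEBHCAT}/mapreduce/jar?user.name=${CALLER}'"
echo "$SUBMIT"
```

The response contains the job ID, which can then be polled via the jobs resource.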

Apache Hive : WebHCat Reference MapReduceStream

Description

Create and queue a Hadoop streaming MapReduce job.

Version: Hive 0.13.0 and later

As of Hive 0.13.0, GET version/hadoop displays the Hadoop version used for the MapReduce job.

URL

http://www.myserver.com/templeton/v1/mapreduce/streaming

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| input | Location of the input data in Hadoop. | Required | None |
| output | Location in which to store the output data. If not specified, WebHCat will store the output in a location that can be discovered using the queue resource. | Optional | See description |
| mapper | Location of the mapper program in Hadoop. | Required | None |
| reducer | Location of the reducer program in Hadoop. | Required | None |
| file | Add an HDFS file to the distributed cache. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE. | Optional | None |
| cmdenv | Set an environment variable using the syntax cmdenv=NAME=VALUE. | Optional | None |
| arg | Set a program argument. | Optional | None |
| statusdir | A directory where WebHCat will write the status of the MapReduce job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is "true", collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id); logs/$job_id/job.xml.html; logs/$job_id/$attempt_id (directory for $attempt_id); logs/$job_id/$attempt_id/stderr; logs/$job_id/$attempt_id/stdout; logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |

The standard parameters are also supported.
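
A sketch of a streaming submission; the paths and user are placeholders (here /bin/cat and /usr/bin/wc stand in for real mapper and reducer programs), and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="ekoifman"

# input/output are HDFS locations; mapper and reducer are executables
# available on the cluster nodes.
SUBMIT="curl -s \
  -d input=mydata/input \
  -d output=mydata/output \
  -d mapper=/bin/cat \
  -d 'reducer=/usr/bin/wc -w' \
  '${WEBHCAT}/mapreduce/streaming?user.name=${CALLER}'"
echo "$SUBMIT"
```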

Apache Hive : WebHCat Reference Pig

Description

Create and queue a Pig job.

URL

http://www.myserver.com/templeton/v1/pig

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| execute | String containing an entire, short Pig program to run. | One of either "execute" or "file" is required. | None |
| file | HDFS file name of a Pig program to run. | One of either "execute" or "file" is required. | None |
| arg | Set a program argument. If -useHCatalog is included, then usehcatalog is interpreted as "true" (Hive 0.13.0 and later). | Optional | None |
| files | Comma-separated files to be copied to the MapReduce cluster. | Optional | None |
| statusdir | A directory where WebHCat will write the status of the Pig job. If provided, it is the caller's responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is "true", collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id); logs/$job_id/job.xml.html; logs/$job_id/$attempt_id (directory for $attempt_id); logs/$job_id/$attempt_id/stderr; logs/$job_id/$attempt_id/stdout; logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job's job ID. | Optional | None |
| usehcatalog | Specify that the submitted job uses HCatalog and therefore needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. (See HIVE-5133.) This parameter was introduced in Hive 0.13.0. It can also be set to "true" by including -useHCatalog in the arg parameter. Also, if webhcat-site.xml defines the parameters templeton.hive.archive, templeton.hive.home, and templeton.hcat.home, then WebHCat will ship the Hive tar to the target node where the job runs. (See HIVE-5547.) This means that Hive does not need to be installed on every node in the Hadoop cluster. It does not ensure that Pig is installed on the target node in the cluster. This is independent of security, but improves manageability. The webhcat-site.xml parameters are documented in webhcat-default.xml. | Optional in Hive 0.13.0+ | false |

The standard parameters are also supported.
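
A sketch of a Pig submission using the file parameter; the script path and user are placeholders, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="ekoifman"

# file points at a Pig script already stored in HDFS; passing
# -useHCatalog as a program argument also flips usehcatalog to "true"
# (Hive 0.13.0 and later).
SUBMIT="curl -s \
  -d file=pig/wordcount.pig \
  -d arg=-useHCatalog \
  -d statusdir=pig/status \
  '${WEBHCAT}/pig?user.name=${CALLER}'"
echo "$SUBMIT"
```

For a very short program, execute could carry the Pig text inline instead of file.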

Apache Hive : WebHCat Reference PostTable

Description

Rename an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| :table | The existing (old) table name | Required | None |
| rename | The new table name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is "rwxrw-r-x". | Optional | None |

The standard parameters are also supported.

Results

| Name | Description |
|------|-------------|
| table | The new table name |
| database | The database name |

Example

Curl Command

% curl -s -d rename=test_table_2 \
       'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table?user.name=ekoifman'

Apache Hive : WebHCat Reference PutColumn

Description

Create a column in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| type | The type of column to add, like "string" or "int" | Required | None |
| comment | The column comment, like a description | Optional | None |

The standard parameters are also supported.
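
Following the JSON-body pattern of the PutDB example later in this reference, a column creation might look like the sketch below; the database, table, and column names are made up, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"

# type is required; comment is optional.
REQUEST="curl -s -X PUT -HContent-type:application/json \
  -d '{\"type\": \"string\", \"comment\": \"The brand name\"}' \
  '${WEBHCAT}/ddl/database/default/table/test_table/column/brand?user.name=${CALLER}'"
echo "$REQUEST"
```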

Apache Hive : WebHCat Reference PutDB

Description

Create a database.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The database location | Optional | None |
| comment | A comment for the database, like a description | Optional | None |
| properties | The database properties | Optional | None |

The standard parameters are also supported.

Results

| Name | Description |
|------|-------------|
| database | The database name |

Example

Curl Command

% curl -s -X PUT -HContent-type:application/json \
       -d '{ "comment":"Hello there",
             "location":"hdfs://localhost:9000/user/hive/my_warehouse",
             "properties":{"a":"b"}}' \
       'http://localhost:50111/templeton/v1/ddl/database/newdb?user.name=rachel'

JSON Output

{
 "database":"newdb"
}

Navigation Links Previous: GET ddl/database/:db Next: DELETE ddl/database/:db

Apache Hive : WebHCat Reference PutPartition

Description

Create a partition in an HCatalog table.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, a col_name='value' list. Be careful to properly encode the quotes for HTTP, for example, country=%27algeria%27. | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The location for partition creation | Required | None |
| ifNotExists | If true, no error is returned if the partition already exists. | Optional | false |

The standard parameters are also supported.
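
The quote encoding called out in the parameter table can be sketched as follows; the database, table, and location are made-up examples, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"

# The partition spec country='algeria' must have its single quotes
# URL-encoded as %27.
PARTITION="country=%27algeria%27"
REQUEST="curl -s -X PUT -HContent-type:application/json \
  -d '{\"location\": \"loc_a\"}' \
  '${WEBHCAT}/ddl/database/default/table/test_table/partition/${PARTITION}?user.name=${CALLER}'"
echo "$REQUEST"
```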

Apache Hive : WebHCat Reference PutProperty

Description

Set a single property on an HCatalog table. If the property already exists, its value is replaced.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| value | The property value | Required | None |

The standard parameters are also supported.
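
A sketch of setting one property, following the JSON-body pattern used elsewhere in this reference; the table and property names are made up, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"

# value is required; the property name goes in the URL path.
REQUEST="curl -s -X PUT -HContent-type:application/json \
  -d '{\"value\": \"apple\"}' \
  '${WEBHCAT}/ddl/database/default/table/test_table/property/fruit?user.name=${CALLER}'"
echo "$REQUEST"
```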

Apache Hive : WebHCat Reference PutTable

Description

Create a new HCatalog table. For more information, please refer to the Hive documentation for CREATE TABLE.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name. | Required | None |
| :table | The new table name. | Required | None |
| group | The user group to use when creating a table. | Optional | None |
| permissions | The permissions string to use when creating a table. | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table. | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists. | Optional | false |
| comment | Comment for the table. | Optional | None |
| columns | A list of column descriptions, including name, type, and an optional comment. | Optional | None |
| partitionedBy | A list of column descriptions used to partition the table. Like the columns parameter, this is a list of name, type, and comment fields. | Optional | None |
| clusteredBy | An object describing how to cluster the table, including the parameters columnNames, sortedBy, and numberOfBuckets. The sortedBy parameter includes the parameters columnName and order (ASC for ascending or DESC for descending). For further information please refer to the examples below or to the Hive documentation. | Optional | None |
| format | Storage format description, including parameters for rowFormat, storedAs, and storedBy. For further information please refer to the examples below or to the Hive documentation. | Optional | None |
| location | The HDFS path. | Optional | None |
| tableProperties | A list of table property names and values (key/value pairs). | Optional | None |

The standard parameters are also supported.
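
A sketch of a table creation carrying columns, a partition column, and a storage format in the JSON body; all names and values here are made-up examples, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"

# columns and partitionedBy are lists of {name, type, comment} objects;
# format describes the storage (here a simple storedAs).
BODY='{
  "comment": "Best table made today",
  "columns": [
    {"name": "id", "type": "bigint"},
    {"name": "price", "type": "float", "comment": "The unit price"}
  ],
  "partitionedBy": [
    {"name": "country", "type": "string"}
  ],
  "format": {"storedAs": "rcfile"}
}'
REQUEST="curl -s -X PUT -HContent-type:application/json -d '${BODY}' \
  '${WEBHCAT}/ddl/database/default/table/test_table?user.name=${CALLER}'"
echo "$REQUEST"
```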

Apache Hive : WebHCat Reference PutTableLike

Description

Create a new HCatalog table like an existing one.

URL

http://www.myserver.com/templeton/v1/ddl/database/:db/table/:existingtable/like/:newtable

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :db | The database name | Required | None |
| :existingtable | The existing table name | Required | None |
| :newtable | The new table name | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table. | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists. | Optional | false |
| location | The HDFS path | Optional | None |

The standard parameters are also supported.
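
A sketch of copying an existing table's definition; the table names are made up, and the command is echoed rather than run:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"

# Creates test_table_c with the same definition as test_table; the body
# carries only optional settings such as ifNotExists.
REQUEST="curl -s -X PUT -HContent-type:application/json \
  -d '{\"ifNotExists\": true}' \
  '${WEBHCAT}/ddl/database/default/table/test_table/like/test_table_c?user.name=${CALLER}'"
echo "$REQUEST"
```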

Apache Hive : WebHCat Reference ResponseTypes

Description

Returns a list of the response types supported by WebHCat (Templeton).

URL

http://www.myserver.com/templeton/:version

Parameters

| Name | Description | Required? | Default |
|------|-------------|-----------|---------|
| :version | The WebHCat version number. (Currently this must be "v1".) | Required | None |

The standard parameters are also supported.

Results

| Name | Description |
|------|-------------|
| responseTypes | A list of all supported response types |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1'

JSON Output

{
  "responseTypes": [
    "application/json"
  ]
}

JSON Output (error)

{
  "error": "null for uri: http://localhost:50111/templeton/v2"
}

Navigation Links Previous: Reference: WebHCat Resources
Next: GET status

Apache Hive : WebHCat Reference Status

Description

Returns the current status of the WebHCat (Templeton) server. Useful for heartbeat monitoring.

URL

http://www.myserver.com/templeton/v1/status

Parameters

Only the standard parameters are accepted.

Results

| Name | Description |
|------|-------------|
| status | "ok" if the WebHCat server was contacted. |
| version | String containing the version number, similar to "v1". |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/status'

JSON Output

{
 "status": "ok",
 "version": "v1"
}

Navigation Links Previous: Response Types (GET :version) Next: GET version

Apache Hive : WebHCat Reference Version

Description

Returns a list of supported versions and the current version.

URL

http://www.myserver.com/templeton/v1/version

Parameters

Only the standard parameters are accepted.

Results

| Name | Description |
|------|-------------|
| supportedVersions | A list of all supported versions. |
| version | The current version. |

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/version'

JSON Output

{
 "supportedVersions": [
   "v1"
 ],
 "version": "v1"
}

Navigation Links Previous: GET status
Next: GET version/hive

Apache Hive : WebHCat Reference VersionHadoop

Description

Return the version of Hadoop being run when WebHCat creates a MapReduce job (POST mapreduce/jar or mapreduce/streaming).

Version: Hive 0.13.0 and later

GET version/hadoop is introduced in Hive release 0.13.0 (HIVE-6226).

URL

http://www.myserver.com/templeton/v1/version/hadoop

Parameters

Only the standard parameters are accepted.

Results

Returns the Hadoop version.

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/version/hadoop?user.name=ekoifman'

JSON Output

[
  {"module":"hadoop","version":"2.4.1-SNAPSHOT"}
]

Navigation Links Previous: GET version/hive
Next: POST ddl

Apache Hive : WebHCat Reference VersionHive

Description

Return the version of Hive being run when WebHCat issues Hive queries or commands (POST hive).

Version: Hive 0.13.0 and later

GET version/hive is introduced in Hive release 0.13.0 (HIVE-6226).

URL

http://www.myserver.com/templeton/v1/version/hive

Parameters

Only the standard parameters are accepted.

Results

Returns the Hive version.

Example

Curl Command

% curl -s 'http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman'

JSON Output

[
  {"module":"hive","version":"0.14.0-SNAPSHOT"}
]
Navigation Links Previous: GET version
Next: GET version/hadoop

Apache Hive : WebHCat UsingWebHCat

Version information

The HCatalog project graduated from the Apache incubator and merged with the Hive project on March 26, 2013.
Hive version 0.11.0 is the first release that includes HCatalog and its REST API, WebHCat.

Introduction to WebHCat

This document describes the HCatalog REST API, WebHCat, which was previously called Templeton.

As shown in the figure below, developers make HTTP requests to access Hadoop MapReduce (or YARN), Pig, Hive, and HCatalog DDL from within applications. Data and code used by this API are maintained in HDFS. HCatalog DDL commands are executed directly when requested. MapReduce, Pig, and Hive jobs are queued by WebHCat (Templeton) servers and can be monitored for progress or stopped as required. Developers specify a location in HDFS into which Pig, Hive, and MapReduce results should be placed.
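
The request flow described above can be sketched end to end; the server address, user, job ID, and file names below are placeholders, and the commands are echoed rather than executed:

```shell
# Hypothetical WebHCat endpoint and caller; adjust for your cluster.
WEBHCAT="http://localhost:50111/templeton/v1"
CALLER="rachel"
STATUSDIR="wordcount/status"

# 1. Submit: the job is queued and a job ID comes back in the response.
SUBMIT="curl -s -d file=pig/wordcount.pig -d statusdir=${STATUSDIR} \
  '${WEBHCAT}/pig?user.name=${CALLER}'"

# 2. Monitor: poll the queued job using the returned ID (placeholder here).
POLL="curl -s '${WEBHCAT}/jobs/job_201312091733_0001?user.name=${CALLER}'"

# 3. Collect: when the job finishes, read its output from STATUSDIR in
#    HDFS, for example ${STATUSDIR}/stdout and ${STATUSDIR}/stderr.
echo "$SUBMIT"
echo "$POLL"
```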