WebHCat (Templeton) Manual
Apache Hive : WebHCat
Dec 12, 2024
This is the manual for WebHCat, previously known as Templeton. WebHCat is the REST API for HCatalog, a table and storage management layer for Hadoop.
See the HCatalog Manual for general HCatalog documentation.
Apache Hive : WebHCat Configure
Configuration Files
The configuration for WebHCat (Templeton) merges the normal Hadoop configuration with the WebHCat-specific variables. Because WebHCat is designed to connect services that are not normally connected, the configuration is more complex than might be desirable.
The WebHCat-specific configuration is split into two layers:
- webhcat-default.xml – All the configuration variables that WebHCat needs. This file sets the defaults that ship with WebHCat and should only be changed by WebHCat developers. Do not copy this file or change it to maintain local installation settings. Because webhcat-default.xml is present in the WebHCat war file, editing a local copy of it will not change the configuration.
- webhcat-site.xml – The (possibly empty) configuration file in which the system administrator can set variables for their Hadoop cluster. Create this file and maintain entries in it for configuration variables that require you to override default values based on your local installation.
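As a sketch, a minimal webhcat-site.xml might override just the port the server listens on. The property name templeton.port ships in webhcat-default.xml; the value shown is only an example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Example override only: the port the WebHCat server listens on -->
  <property>
    <name>templeton.port</name>
    <value>50111</value>
  </property>
</configuration>
```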
Apache Hive : WebHCat Installation
WebHCat Installed with Hive
WebHCat and HCatalog are installed with Hive, starting with Hive release 0.11.0.
If you install Hive from the binary tarball, the WebHCat server command webhcat_server.sh is in the hcatalog/sbin directory.
Hive installation is documented here.
WebHCat Installation Procedure
Note: WebHCat was originally called Templeton. For backward compatibility the name still appears in URLs, log file names, variable names, etc.
- Ensure that the required related installations are in place, and place required files into the Hadoop distributed cache.
- Download and unpack the HCatalog distribution.
- Set the TEMPLETON_HOME environment variable to the base of the HCatalog REST server installation. This will usually be the same as HCATALOG_HOME. This is used to find the WebHCat (Templeton) configuration.
- Set the JAVA_HOME, HADOOP_PREFIX, and HIVE_HOME environment variables.
- Review the configuration and update or create webhcat-site.xml as required. Ensure that site-specific component installation locations are accurate, especially the Hadoop configuration path. Configuration variables that use a filesystem path try to have reasonable defaults, but it is always safe to specify a full and complete path.
- Verify that HCatalog is installed and that the hcat executable is in the PATH.
- Build HCatalog using the command "ant jar" from the top-level HCatalog directory.
- Start the REST server with the command "hcatalog/sbin/webhcat_server.sh start" for Hive 0.11.0 releases and later, or "sbin/webhcat_server.sh start" for installations prior to HCatalog merging with Hive.
- Check that your local install works. Assuming that the server is running on port 50111, the following command would give output similar to that shown.
% curl -i http://localhost:50111/templeton/v1/status
HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked
Server: Jetty(7.6.0.v20120127)
{"status":"ok","version":"v1"}
%
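The health check above can be scripted. The sketch below inspects a captured response body for the "status":"ok" field; here BODY is hard-coded to the example output shown above, while in a live install you would capture it with BODY=$(curl -s http://localhost:50111/templeton/v1/status):

```shell
# Minimal health check on a WebHCat status response body.
# BODY is the example response from above; in practice, capture it with curl.
BODY='{"status":"ok","version":"v1"}'
case "$BODY" in
  *'"status":"ok"'*) RESULT=healthy ;;
  *)                 RESULT=unhealthy ;;
esac
echo "$RESULT"
```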
Server Commands
- Start the server:
  - sbin/webhcat_server.sh start (HCatalog 0.5.0 and earlier, prior to Hive release 0.11.0)
  - hcatalog/sbin/webhcat_server.sh start (Hive release 0.11.0 and later)
- Stop the server:
  - sbin/webhcat_server.sh stop (HCatalog 0.5.0 and earlier, prior to Hive release 0.11.0)
  - hcatalog/sbin/webhcat_server.sh stop (Hive release 0.11.0 and later)
- End-to-end build, run, test:
ant e2e
Requirements
- Ant, version 1.8 or higher
- Hadoop, version 1.0.3 or higher
- ZooKeeper is required if you are using the ZooKeeper storage class. (Be sure to review and update the ZooKeeper-related WebHCat configuration.)
- HCatalog, version 0.5.0 or higher. The hcat executable must be both in the PATH and properly configured in the WebHCat configuration.
- Permissions must be given to the user running the server. (See below.)
- If running a secure cluster, Kerberos keys and principals must be created. (See below.)
- Hadoop Distributed Cache. To use Hive, Pig, or Hadoop Streaming resources, see instructions below for placing the required files in the Hadoop Distributed Cache.
Hadoop Distributed Cache
The server requires some files be accessible on the Hadoop distributed cache. For example, to avoid the installation of Pig and Hive everywhere on the cluster, the server gathers a version of Pig or Hive from the Hadoop distributed cache whenever those resources are invoked. After placing the following components into HDFS please update the site configuration as required for each.
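For example, placing a Hive tarball into HDFS might look like the following sketch. The tarball name and HDFS destination are assumptions, and the upload command is echoed rather than executed so the sketch runs without a cluster:

```shell
# Illustrative upload of a Hive tarball to HDFS for the distributed cache.
# Echoed rather than run; adjust names and paths for your installation.
TARBALL=hive-0.11.0.tar.gz
DEST=/apps/templeton/$TARBALL
UPLOAD="hdfs dfs -put $TARBALL $DEST"
echo "$UPLOAD"
# Afterwards, point the corresponding webhcat-site.xml variable
# (e.g. templeton.hive.archive) at the uploaded HDFS path.
```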
Apache Hive : WebHCat Reference
Reference: WebHCat Resources
This overview page lists all of the WebHCat resources. (DDL resources are listed here and on another overview page. For information about HCatalog DDL commands, see HCatalog DDL. For information about Hive DDL commands, see Hive Data Definition Language.)
| Category | Resource (Type) | Description |
|---|---|---|
| General | :version (GET) | Return a list of supported response types. |
| status (GET) | Return the WebHCat server status. | |
| version (GET) | Return a list of supported versions and the current version. | |
| version/hive (GET) | Return the Hive version being run. (Added in Hive 0.13.0.) | |
| version/hadoop (GET) | Return the Hadoop version being run. (Added in Hive 0.13.0.) | |
| DDL | ddl (POST) | Perform an HCatalog DDL command. |
| ddl/database (GET) | List HCatalog databases. | |
| ddl/database/:db (GET) | Describe an HCatalog database. | |
| ddl/database/:db (PUT) | Create an HCatalog database. | |
| ddl/database/:db (DELETE) | Delete (drop) an HCatalog database. | |
| ddl/database/:db/table (GET) | List the tables in an HCatalog database. | |
| ddl/database/:db/table/:table (GET) | Describe an HCatalog table. | |
| ddl/database/:db/table/:table (PUT) | Create a new HCatalog table. | |
| ddl/database/:db/table/:table (POST) | Rename an HCatalog table. | |
| ddl/database/:db/table/:table (DELETE) | Delete (drop) an HCatalog table. | |
| ddl/database/:db/table/:existingtable/like/:newtable (PUT) | Create a new HCatalog table like an existing one. | |
| ddl/database/:db/table/:table/partition (GET) | List all partitions in an HCatalog table. | |
| ddl/database/:db/table/:table/partition/:partition (GET) | Describe a single partition in an HCatalog table. | |
| ddl/database/:db/table/:table/partition/:partition (PUT) | Create a partition in an HCatalog table. | |
| ddl/database/:db/table/:table/partition/:partition (DELETE) | Delete (drop) a partition in an HCatalog table. | |
| ddl/database/:db/table/:table/column (GET) | List the columns in an HCatalog table. | |
| ddl/database/:db/table/:table/column/:column (GET) | Describe a single column in an HCatalog table. | |
| ddl/database/:db/table/:table/column/:column (PUT) | Create a column in an HCatalog table. | |
| ddl/database/:db/table/:table/property (GET) | List table properties. | |
| ddl/database/:db/table/:table/property/:property (GET) | Return the value of a single table property. | |
| ddl/database/:db/table/:table/property/:property (PUT) | Set a table property. | |
| MapReduce | mapreduce/streaming (POST) | Create and queue Hadoop streaming MapReduce jobs. |
| mapreduce/jar (POST) | Create and queue standard Hadoop MapReduce jobs. | |
| Pig | pig (POST) | Create and queue Pig jobs. |
| Hive | hive (POST) | Run Hive queries and commands. |
| Queue (deprecated in Hive 0.12, removed in Hive 0.14) | queue (GET) | Return a list of all job IDs. (Removed in Hive 0.14.0.) |
| queue/:jobid (GET) | Return the status of a job given its ID. (Removed in Hive 0.14.0.) | |
| queue/:jobid (DELETE) | Kill a job given its ID. (Removed in Hive 0.14.0.) | |
| Jobs (Hive 0.12 and later) | jobs (GET) | Return a list of all job IDs. |
| jobs/:jobid (GET) | Return the status of a job given its ID. | |
| jobs/:jobid (DELETE) | Kill a job given its ID. |
Apache Hive : WebHCat Reference AllDDL
WebHCat Reference: DDL Resources
This is an overview page for the WebHCat DDL resources. The full list of WebHCat resources is on this overview page.
- For information about HCatalog DDL commands, see HCatalog DDL.
- For information about Hive DDL commands, see Hive Data Definition Language.
Apache Hive : WebHCat Reference DDL
Description
Performs an HCatalog DDL command. The command is executed immediately upon request. Responses are limited to 1 MB. For requests that may return longer results, consider using the Hive resource instead.
URL
http://www.myserver.com/templeton/v1/ddl
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| exec | The HCatalog DDL string to execute | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table. The format is “rwxrw-r-x”. | Optional | None |
The standard parameters are also supported.
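As a sketch, a DDL request could be issued with curl as follows. The host, port, and user.name value are assumptions, and the command is echoed rather than sent so the example is safe to run anywhere:

```shell
# Build a POST ddl request; --data-urlencode handles spaces in the exec string.
EXEC='show tables;'
CMD="curl -s -d user.name=ctdean --data-urlencode 'exec=${EXEC}' 'http://localhost:50111/templeton/v1/ddl'"
echo "$CMD"
```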
Apache Hive : WebHCat Reference DeleteDB
Description
Delete a database.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| ifExists | Hive returns an error if the database specified does not exist, unless ifExists is set to true. | Optional | false |
| option | Parameter set to either “restrict” or “cascade”. Restrict will remove the schema if all the tables are empty. Cascade removes everything including data and definitions. | Optional | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is “rwxrw-r-x”. | Optional | None |
The standard parameters are also supported.
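A sketch of dropping a database with the cascade option, combining the parameters above. The host, database name, and user are assumptions; the command is echoed rather than sent:

```shell
# Build a DELETE request that drops a database and everything in it.
DB=newdb
URL="http://localhost:50111/templeton/v1/ddl/database/${DB}?ifExists=true&option=cascade&user.name=ctdean"
CMD="curl -s -X DELETE '${URL}'"
echo "$CMD"
```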
Apache Hive : WebHCat Reference DeleteJob
Description
Kill a job given its job ID. Substitute “:jobid” with the job ID received when the job was created.
Version: Deprecated in 0.12.0
DELETE queue/:jobid is deprecated starting in Hive release 0.12.0. Users are encouraged to use DELETE jobs/:jobid instead. (See HIVE-4443.) DELETE queue/:jobid is equivalent to DELETE jobs/:jobid; see [DELETE jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejobid/) for documentation.
Version: Obsolete in 0.14.0
DELETE queue/:jobid will be removed in Hive release 0.14.0. (See HIVE-6432.)
Use [DELETE jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejobid/) instead.
Apache Hive : WebHCat Reference DeleteJobID
Description
Kill a job given its job ID. Substitute “:jobid” with the job ID received when the job was created.
Version: Hive 0.12.0 and later
DELETE jobs/:jobid is introduced in Hive release 0.12.0. It is equivalent to [DELETE queue/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-deletejob/) in prior releases. DELETE queue/:jobid is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).
URL
http://www.myserver.com/templeton/v1/jobs/:jobid
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :jobid | The job ID to delete. This is the ID received when the job was created. | Required | None |
The standard parameters are also supported.
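As a sketch, killing a job by ID could look like the following; the job ID, host, and user are assumptions, and the command is echoed rather than sent:

```shell
# Build a DELETE request that kills a job by its ID.
JOBID=job_201312091733_0003
CMD="curl -s -X DELETE 'http://localhost:50111/templeton/v1/jobs/${JOBID}?user.name=ctdean'"
echo "$CMD"
```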
Apache Hive : WebHCat Reference DeletePartition
Description
Delete (drop) a partition in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name=‘value’ list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |
| ifExists | Hive returns an error if the partition specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is “rwxrw-r-x”. | Optional | None |
The standard parameters are also supported.
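Since the quotes in the :partition value must be percent-encoded, a small helper can do the substitution. This sketch assumes the partition spec contains no other characters that need escaping:

```shell
# Percent-encode the single quotes in a partition spec for use in a URL.
encode_partition() { printf '%s' "$1" | sed "s/'/%27/g"; }
encode_partition "country='algeria'"
```

Running the helper on country='algeria' yields country=%27algeria%27, the form shown in the table above.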
Apache Hive : WebHCat Reference DeleteTable
Description
Delete (drop) an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| ifExists | Hive 0.7.0 and later returns an error if the table specified does not exist, unless ifExists is set to true. | Optional | false |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is “rwxrw-r-x”. | Optional | None |
The standard parameters are also supported.
Apache Hive : WebHCat Reference GetColumn
Description
Describe a single column in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| column | A JSON object containing the column name, type, and comment (if any) |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/column/price?user.name=ctdean'
JSON Output
{
"database": "default",
"table": "test_table",
"column": {
"name": "price",
"comment": "The unit price",
"type": "float"
}
}
Apache Hive : WebHCat Reference GetColumns
Description
List the columns in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| columns | A list of column names and types |
| database | The database name |
| table | The table name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/column?user.name=ctdean'
JSON Output
{
"columns": [
{
"name": "id",
"type": "bigint"
},
{
"name": "user",
"comment": "The user name",
"type": "string"
},
{
"name": "my_p",
"type": "string"
},
{
"name": "my_q",
"type": "string"
}
],
"database": "default",
"table": "my_table"
}
Apache Hive : WebHCat Reference GetDB
Description
Describe a database. (Note: This resource accepts a "format=extended" parameter; however, the output structure does not change when it is used.)
URL
http://www.myserver.com/templeton/v1/ddl/database/:db
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| location | The database location |
| params | The database parameters |
| comment | The database comment |
| database | The database name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/newdb?user.name=ctdean'
JSON Output
{
"location":"hdfs://localhost:9000/warehouse/newdb.db",
"params":"{a=b}",
"comment":"Hello there",
"database":"newdb"
}
JSON Output (error)
{
"error": "No such database: newdb",
"errorCode": 404
}
Apache Hive : WebHCat Reference GetDBs
Description
List the databases in HCatalog.
URL
http://www.myserver.com/templeton/v1/ddl/database
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| like | List only databases whose names match the specified pattern. | Optional | “*” (List all) |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| databases | A list of database names. |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database?user.name=ctdean&like=n*'
JSON Output
{
"databases": [
"newdb",
"newdb2"
]
}
Apache Hive : WebHCat Reference GetPartition
Description
Describe a single partition in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name=‘value’ list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| database | The database name |
| table | The table name |
| partition | The partition name |
| partitioned | True if the table is partitioned |
| location | Location of table |
| outputFormat | Output format |
| columns | List of column names, types, and comments |
| owner | The owner’s user name |
| partitionColumns | List of the partition columns |
| inputFormat | Input format |
Example
Curl Command
% curl -s \
'http://localhost:50111/templeton/v1/ddl/database/default/table/mytest/partition/country=%27US%27?user.name=ctdean'
JSON Output
{
"partitioned": true,
"location": "hdfs://ip-10-77-6-151.ec2.internal:8020/apps/hive/warehouse/mytest/loc1",
"outputFormat": "org.apache.hadoop.hive.ql.io.RCFileOutputFormat",
"columns": [
{
"name": "i",
"type": "int"
},
{
"name": "j",
"type": "bigint"
},
{
"name": "ip",
"comment": "IP Address of the User",
"type": "string"
}
],
"owner": "rachel",
"partitionColumns": [
{
"name": "country",
"type": "string"
}
],
"inputFormat": "org.apache.hadoop.hive.ql.io.RCFileInputFormat",
"database": "default",
"table": "mytest",
"partition": "country='US'"
}
Apache Hive : WebHCat Reference GetPartitions
Description
List all the partitions in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| partitions | A list of partition values and of partition names |
| database | The database name |
| table | The table name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/my_table/partition?user.name=ctdean'
JSON Output
{
"partitions": [
{
"values": [
{
"columnName": "dt",
"columnValue": "20120101"
},
{
"columnName": "country",
"columnValue": "US"
}
],
"name": "dt='20120101',country='US'"
}
],
"database": "default",
"table": "my_table"
}
Apache Hive : WebHCat Reference GetProperties
Description
List all the properties of an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| properties | A list of the table’s properties in name: value pairs |
| database | The database name |
| table | The table name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/property?user.name=ctdean'
JSON Output
{
"properties": {
"fruit": "apple",
"last_modified_by": "ctdean",
"hcat.osd": "org.apache.hcatalog.rcfile.RCFileOutputDriver",
"color": "blue",
"last_modified_time": "1331620706",
"hcat.isd": "org.apache.hcatalog.rcfile.RCFileInputDriver",
"transient_lastDdlTime": "1331620706",
"comment": "Best table made today",
"country": "Albania"
},
"table": "test_table",
"database": "default"
}
Apache Hive : WebHCat Reference GetProperty
Description
Return the value of a single table property.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| property | The requested property’s name: value pair |
| database | The database name |
| table | The table name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table/property/fruit?user.name=ctdean'
JSON Output
{
"property": {
"fruit": "apple"
},
"table": "test_table",
"database": "default"
}
JSON Output (error)
{
"error": "Table test_table does not exist",
"errorCode": 404,
"database": "default",
"table": "test_table"
}
Apache Hive : WebHCat Reference GetTable
Description
Describe an HCatalog table. Normally returns a simple list of columns (using “desc table”), but the extended format will show more information (using “show table extended like”).
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table?format=extended
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| format | Set “format=extended” to see additional information (using “show table extended like”) | Optional | Not extended |
The standard parameters are also supported.
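A sketch of requesting the extended table description; the database and table names, host, and user are assumptions, and the command is echoed rather than sent:

```shell
# Build a GET request for the extended table description.
DB=default
TABLE=my_table
CMD="curl -s 'http://localhost:50111/templeton/v1/ddl/database/${DB}/table/${TABLE}?format=extended&user.name=ctdean'"
echo "$CMD"
```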
Apache Hive : WebHCat Reference GetTables
Description
List the tables in an HCatalog database.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| like | List only tables whose names match the specified pattern | Optional | “*” (List all tables) |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| tables | A list of table names |
| database | The database name |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table?user.name=ctdean&like=m*'
JSON Output
{
"tables": [
"my_table",
"my_table_2",
"my_table_3"
],
"database": "default"
}
JSON Output (error)
{
"errorDetail": "
org.apache.hadoop.hive.ql.metadata.HiveException: ERROR: The database defaultsd does not exist.
at org.apache.hadoop.hive.ql.exec.DDLTask.switchDatabase(DDLTask.java:3122)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:224)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
at org.apache.hcatalog.cli.HCatDriver.run(HCatDriver.java:42)
at org.apache.hcatalog.cli.HCatCli.processCmd(HCatCli.java:247)
at org.apache.hcatalog.cli.HCatCli.processLine(HCatCli.java:203)
at org.apache.hcatalog.cli.HCatCli.main(HCatCli.java:162)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
",
"error": "FAILED: Error in metadata: ERROR: The database defaultsd does not exist.",
"errorCode": 500,
"database": "defaultsd"
}
Apache Hive : WebHCat Reference Hive
Description
Runs a Hive query or set of commands.
Version: Hive 0.13.0 and later
As of Hive 0.13.0, GET version/hive displays the Hive version used for the query or commands.
URL
http://www.myserver.com/templeton/v1/hive
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| execute | String containing an entire, short Hive program to run. | One of either “execute” or “file” is required. | None |
| file | HDFS file name of a Hive program to run. | One of either “execute” or “file” is required. | None |
| define | Set a Hive configuration variable using the syntax define=NAME=VALUE. See the note about curl and "=". | Optional | None |
| arg | Set a program argument. This parameter was introduced in Hive 0.12.0. (See HIVE-4444.) | Optional in Hive 0.12.0+ | None |
| files | Comma-separated files to be copied to the map reduce cluster. This parameter was introduced in Hive 0.12.0. (See HIVE-4444.) | Optional in Hive 0.12.0+ | None |
| statusdir | A directory where WebHCat will write the status of the Hive job. If provided, it is the caller’s responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is “true”, collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id) logs/$job_id/job.xml.html logs/$job_id/$attempt_id (directory for $attempt_id) logs/$job_id/$attempt_id/stderr logs/$job_id/$attempt_id/stdout logs/$job_id/$attempt_id/syslog This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job’s job ID. | Optional | None |
The standard parameters are also supported.
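A sketch combining the execute and statusdir parameters; the query, status directory, host, and user are all assumptions, and the command is echoed rather than sent:

```shell
# Build a POST hive request; --data-urlencode handles spaces in the query.
QUERY='select count(*) from mytable;'
CMD="curl -s -d user.name=ctdean --data-urlencode 'execute=${QUERY}' -d statusdir=/tmp/hive.output 'http://localhost:50111/templeton/v1/hive'"
echo "$CMD"
```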
Apache Hive : WebHCat Reference Job
Description
Check the status of a job and get related job information given its job ID. Substitute “:jobid” with the job ID received when the job was created.
Version: Hive 0.12.0 and later
GET jobs/:jobid is introduced in Hive release 0.12.0. It is equivalent to [GET queue/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobinfo/) in prior releases. GET queue/:jobid is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).
Apache Hive : WebHCat Reference JobIDs
Description
Return a list of all job IDs.
Version: Deprecated in 0.12.0
GET queue is deprecated starting in Hive release 0.12.0. (See HIVE-4443.) Users are encouraged to use [GET jobs](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobs/) instead.
Version: Obsolete in 0.14.0
GET queue will be removed in Hive release 0.14.0. (See HIVE-6432.)
Use [GET jobs](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobs/) instead.
URL
http://www.myserver.com/templeton/v1/queue
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| showall | If showall is set to “true”, then the request will return all jobs the user has permission to view, not only the jobs belonging to the user. This parameter is not available in releases prior to Hive 0.12.0. (See HIVE-4442.) | Optional in Hive 0.12.0+ | false |
The standard parameters are also accepted.
Apache Hive : WebHCat Reference JobInfo
Description
Check the status of a job and get related job information given its job ID. Substitute “:jobid” with the job ID received when the job was created.
Version: Deprecated in 0.12.0
GET queue/:jobid is deprecated starting in Hive release 0.12.0. Users are encouraged to use GET jobs/:jobid instead. (See HIVE-4443.) GET queue/:jobid is equivalent to GET jobs/:jobid; see [GET jobs/:jobid](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-job/) for documentation.
Version: Obsolete in 0.14.0
Apache Hive : WebHCat Reference Jobs
Description
Return a list of all job IDs.
Version: Hive 0.12.0 and later
GET jobs is introduced in Hive release 0.12.0. It is equivalent to [GET queue](https://hive.apache.org/docs/latest/webhcat/webhcat-reference-jobids/) in prior releases. GET queue is now deprecated (HIVE-4443) and will be removed in Hive 0.14.0 (HIVE-6432).
URL
http://www.myserver.com/templeton/v1/jobs
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| fields | If fields is set to "*", the request will return full details of the job. If fields is missing, only the job ID is returned. Currently the only allowed value is "*"; other values are not allowed and will throw an exception. | Optional | None |
| showall | If showall is set to “true”, the request will return all jobs the user has permission to view, not only the jobs belonging to the user. | Optional | false |
| jobid | If jobid is present, only the records whose job ID is lexicographically greater than jobid are returned. For example, if jobid = "job_201312091733_0001", the jobs whose job ID is greater than "job_201312091733_0001" are returned. The number of records returned depends on the value of numrecords. This parameter is not available in releases prior to Hive 0.13.0. (See HIVE-5519.) | Optional in Hive 0.13.0+ | None |
| numrecords | If the jobid and numrecords parameters are present, the top numrecords records appearing after jobid will be returned after sorting the job ID list lexicographically. If the jobid parameter is missing and numrecords is present, the top numrecords will be returned after lexicographically sorting the job ID list. If the jobid parameter is present and numrecords is missing, all the records whose job ID is greater than jobid are returned. This parameter is not available in releases prior to Hive 0.13.0. (See HIVE-5519.) | Optional in Hive 0.13.0+ | All |
The standard parameters are also accepted.
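As a sketch, the paging parameters above can be combined in one request. The host, port, user name, and job ID below are placeholders, not values from this manual; the snippet only composes and prints the request URL, which on a live cluster you would pass to curl.

```shell
# Compose a paged GET jobs request: full details (fields=*), jobs from
# all users (showall=true), up to 5 records sorting after the given jobid.
BASE='http://localhost:50111/templeton/v1'   # placeholder server
URL="$BASE/jobs?user.name=ekoifman&fields=*&showall=true&jobid=job_201312091733_0001&numrecords=5"

# On a live cluster you would run:  curl -s "$URL"
echo "$URL"
```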
Apache Hive : WebHCat Reference MapReduceJar
Description
Create and queue a standard Hadoop MapReduce job.
Version: Hive 0.13.0 and later
As of Hive 0.13.0, GET version/hadoop displays the Hadoop version used for the MapReduce job.
URL
http://www.myserver.com/templeton/v1/mapreduce/jar
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| jar | Name of the jar file for Map Reduce to use. | Required | None |
| class | Name of the class for Map Reduce to use. | Required | None |
| libjars | Comma separated jar files to include in the classpath. | Optional | None |
| files | Comma separated files to be copied to the map reduce cluster. | Optional | None |
| arg | Set a program argument. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE | Optional | None |
| statusdir | A directory where WebHCat will write the status of the Map Reduce job. If provided, it is the caller’s responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is “true”, collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id), logs/$job_id/job.xml.html, logs/$job_id/$attempt_id (directory for $attempt_id), logs/$job_id/$attempt_id/stderr, logs/$job_id/$attempt_id/stdout, logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job’s job ID. | Optional | None |
| usehcatalog | Specify that the submitted job uses HCatalog and therefore needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. (See HIVE-5133.) This parameter was introduced in Hive 0.13.0. Also, if webhcat-site.xml defines the parameters templeton.hive.archive, templeton.hive.home and templeton.hcat.home then WebHCat will ship the Hive tar to the target node where the job runs. (See HIVE-5547.) This means that Hive doesn’t need to be installed on every node in the Hadoop cluster. This is independent of security, but improves manageability. The webhcat-site.xml parameters are documented in webhcat-default.xml. | Optional in Hive 0.13.0+ | false |
The standard parameters are also supported.
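A minimal sketch of a POST to mapreduce/jar follows. The jar name, class, HDFS paths, and user name are placeholders; the snippet only composes and prints the curl command. Note that arg may be repeated, once per program argument.

```shell
# Compose a mapreduce/jar submission (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
CMD="curl -s -d user.name=ekoifman -d jar=wordcount.jar -d class=org.myorg.WordCount -d arg=wordcount/input -d arg=wordcount/output -d statusdir=wordcount/status $BASE/mapreduce/jar"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```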
Apache Hive : WebHCat Reference MapReduceStream
Description
Create and queue a Hadoop streaming MapReduce job.
Version: Hive 0.13.0 and later
As of Hive 0.13.0, GET version/hadoop displays the Hadoop version used for the MapReduce job.
URL
http://www.myserver.com/templeton/v1/mapreduce/streaming
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| input | Location of the input data in Hadoop. | Required | None |
| output | Location in which to store the output data. If not specified, WebHCat will store the output in a location that can be discovered using the queue resource. | Optional | See description |
| mapper | Location of the mapper program in Hadoop. | Required | None |
| reducer | Location of the reducer program in Hadoop. | Required | None |
| file | Add an HDFS file to the distributed cache. | Optional | None |
| define | Set a Hadoop configuration variable using the syntax define=NAME=VALUE | Optional | None |
| cmdenv | Set an environment variable using the syntax cmdenv=NAME=VALUE | Optional | None |
| arg | Set a program argument. | Optional | None |
| statusdir | A directory where WebHCat will write the status of the Map Reduce job. If provided, it is the caller’s responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is “true”, collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id), logs/$job_id/job.xml.html, logs/$job_id/$attempt_id (directory for $attempt_id), logs/$job_id/$attempt_id/stderr, logs/$job_id/$attempt_id/stdout, logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job’s job ID. | Optional | None |
The standard parameters are also supported.
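The parameters above can be sketched as a streaming submission. The input/output paths, mapper and reducer programs, and user name are placeholders; the snippet composes and prints the curl command rather than contacting a server.

```shell
# Compose a mapreduce/streaming submission (placeholders throughout).
# A reducer with arguments is single-quoted so the space survives.
BASE='http://localhost:50111/templeton/v1'
CMD="curl -s -d user.name=ekoifman -d input=mydata/in -d output=mydata/out -d mapper=/bin/cat -d 'reducer=/usr/bin/wc -w' -d statusdir=mydata/status $BASE/mapreduce/streaming"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```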
Apache Hive : WebHCat Reference Pig
Description
Create and queue a Pig job.
URL
http://www.myserver.com/templeton/v1/pig
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| execute | String containing an entire, short Pig program to run. | One of either “execute” or “file” is required. | None |
| file | HDFS file name of a Pig program to run. | One of either “execute” or “file” is required. | None |
| arg | Set a program argument. If -useHCatalog is included, then usehcatalog is interpreted as “true” (Hive 0.13.0 and later). | Optional | None |
| files | Comma separated files to be copied to the map reduce cluster. | Optional | None |
| statusdir | A directory where WebHCat will write the status of the Pig job. If provided, it is the caller’s responsibility to remove this directory when done. | Optional | None |
| enablelog | If statusdir is set and enablelog is “true”, collect Hadoop job configuration and logs into a directory named $statusdir/logs after the job finishes. Both completed and failed attempts are logged. The layout of subdirectories in $statusdir/logs is: logs/$job_id (directory for $job_id), logs/$job_id/job.xml.html, logs/$job_id/$attempt_id (directory for $attempt_id), logs/$job_id/$attempt_id/stderr, logs/$job_id/$attempt_id/stdout, logs/$job_id/$attempt_id/syslog. This parameter was introduced in Hive 0.12.0. (See HIVE-4531.) | Optional in Hive 0.12.0+ | None |
| callback | Define a URL to be called upon job completion. You may embed a specific job ID into this URL using $jobId. This tag will be replaced in the callback URL with this job’s job ID. | Optional | None |
| usehcatalog | Specify that the submitted job uses HCatalog and therefore needs to access the metastore, which requires additional steps for WebHCat to perform in a secure cluster. (See HIVE-5133.) This parameter was introduced in Hive 0.13.0. It can also be set to “true” by including -useHCatalog in the arg parameter. Also, if webhcat-site.xml defines the parameters templeton.hive.archive, templeton.hive.home and templeton.hcat.home then WebHCat will ship the Hive tar to the target node where the job runs. (See HIVE-5547.) This means that Hive doesn’t need to be installed on every node in the Hadoop cluster. It does not ensure that Pig is installed on the target node in the cluster. This is independent of security, but improves manageability. The webhcat-site.xml parameters are documented in webhcat-default.xml. | Optional in Hive 0.13.0+ | false |
The standard parameters are also supported.
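A sketch of a Pig submission using the file and arg parameters follows; the script path, status directory, and user name are placeholders. Passing -useHCatalog as an arg implies usehcatalog=true in Hive 0.13.0 and later, as noted above. The snippet composes and prints the curl command only.

```shell
# Compose a pig submission (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
CMD="curl -s -d user.name=ekoifman -d file=mypig/script.pig -d arg=-useHCatalog -d statusdir=mypig/status $BASE/pig"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```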
Apache Hive : WebHCat Reference PostTable
Description
Rename an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The existing (old) table name | Required | None |
| rename | The new table name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use. The format is “rwxrw-r-x”. | Optional | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| table | The new table name |
| database | The database name |
Example
Curl Command
% curl -s -d rename=test_table_2 \
'http://localhost:50111/templeton/v1/ddl/database/default/table/test_table?user.name=ekoifman'
Apache Hive : WebHCat Reference PutColumn
Description
Create a column in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/column/:column
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :column | The column name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| type | The type of column to add, like “string” or “int” | Required | None |
| comment | The column comment, like a description | Optional | None |
The standard parameters are also supported.
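As a sketch, the type and comment parameters are sent as a JSON body with a PUT. The database, table, column name, comment, and user name below are placeholders; the snippet composes and prints the curl command rather than issuing it.

```shell
# Compose a PUT that adds a string column (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
JSON='{"type":"string","comment":"The brand name"}'
CMD="curl -s -X PUT -H Content-type:application/json -d '$JSON' $BASE/ddl/database/default/table/test_table/column/brand?user.name=ekoifman"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```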
Apache Hive : WebHCat Reference PutDB
Description
Create a database.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The database location | Optional | None |
| comment | A comment for the database, like a description | Optional | None |
| properties | The database properties | Optional | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| database | The database name |
Example
Curl Command
% curl -s -X PUT -HContent-type:application/json \
-d '{ "comment":"Hello there",
"location":"hdfs://localhost:9000/user/hive/my_warehouse",
"properties":{"a":"b"}}' \
'http://localhost:50111/templeton/v1/ddl/database/newdb?user.name=rachel'
JSON Output
{
"database":"newdb"
}
Navigation Links Previous: GET ddl/database/:db Next: DELETE ddl/database/:db
Apache Hive : WebHCat Reference PutPartition
Description
Create a partition in an HCatalog table.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/partition/:partition
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :partition | The partition name, col_name=‘value’ list. Be careful to properly encode the quote for http, for example, country=%27algeria%27. | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| location | The location for partition creation | Required | None |
| ifNotExists | If true, you will not receive an error if the partition already exists. | Optional | False |
The standard parameters are also supported.
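A sketch of a partition creation follows; note the %27 encoding of the quotes in the partition name, as described above. The database, table, location, and user name are placeholders, and the snippet only composes and prints the curl command.

```shell
# Compose a PUT that creates a partition (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
JSON='{"location":"loc_a"}'
CMD="curl -s -X PUT -H Content-type:application/json -d '$JSON' $BASE/ddl/database/default/table/test_table/partition/country=%27algeria%27?user.name=ekoifman"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```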
Apache Hive : WebHCat Reference PutProperty
Description
Add a single property to an HCatalog table. If the property already exists, its value is overwritten.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table/property/:property
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :table | The table name | Required | None |
| :property | The property name | Required | None |
| group | The user group to use | Optional | None |
| permissions | The permissions string to use | Optional | None |
| value | The property value | Required | None |
The standard parameters are also supported.
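As a sketch, the value parameter is sent as a JSON body with a PUT. The property name, value, table, and user name below are placeholders; the snippet composes and prints the curl command only.

```shell
# Compose a PUT that sets a table property (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
JSON='{"value":"apples"}'
CMD="curl -s -X PUT -H Content-type:application/json -d '$JSON' $BASE/ddl/database/default/table/test_table/property/fruit?user.name=ekoifman"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```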
Apache Hive : WebHCat Reference PutTable
Description
Create a new HCatalog table. For more information, please refer to the Hive documentation for CREATE TABLE.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:table
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name. | Required | None |
| :table | The new table name. | Required | None |
| group | The user group to use when creating a table. | Optional | None |
| permissions | The permissions string to use when creating a table. | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table. | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists. | Optional | false |
| comment | Comment for the table. | Optional | None |
| columns | A list of column descriptions, including name, type, and an optional comment. | Optional | None |
| partitionedBy | A list of column descriptions used to partition the table. Like the columns parameter this is a list of name, type, and comment fields. | Optional | None |
| clusteredBy | An object describing how to cluster the table including the parameters columnNames, sortedBy, and numberOfBuckets. The sortedBy parameter includes the parameters columnName and order (ASC for ascending or DESC for descending). For further information please refer to the examples below or to the Hive documentation. | Optional | None |
| format | Storage format description including parameters for rowFormat, storedAs, and storedBy. For further information please refer to the examples below or to the Hive documentation. | Optional | None |
| location | The HDFS path. | Optional | None |
| tableProperties | A list of table property names and values (key/value pairs). | Optional | None |
The standard parameters are also supported.
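The table-description parameters above form a JSON body sent with a PUT. The sketch below creates a hypothetical partitioned RCFile table; all column names, the table name, and the user name are illustrative placeholders, and the snippet composes and prints the curl command rather than issuing it.

```shell
# Compose a PUT that creates a partitioned table (placeholders throughout).
BASE='http://localhost:50111/templeton/v1'
JSON='{"comment":"Example table","columns":[{"name":"id","type":"bigint"},{"name":"price","type":"float","comment":"The unit price"}],"partitionedBy":[{"name":"country","type":"string"}],"format":{"storedAs":"rcfile"}}'
CMD="curl -s -X PUT -H Content-type:application/json -d '$JSON' $BASE/ddl/database/default/table/test_table?user.name=ekoifman"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```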
Apache Hive : WebHCat Reference PutTableLike
Description
Create a new HCatalog table like an existing one.
URL
http://www.myserver.com/templeton/v1/ddl/database/:db/table/:existingtable/like/:newtable
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :db | The database name | Required | None |
| :existingtable | The existing table name | Required | None |
| :newtable | The new table name | Required | None |
| group | The user group to use when creating a table | Optional | None |
| permissions | The permissions string to use when creating a table | Optional | None |
| external | Allows you to specify a location so that Hive does not use the default location for this table. | Optional | false |
| ifNotExists | If true, you will not receive an error if the table already exists. | Optional | false |
| location | The HDFS path | Optional | None |
The standard parameters are also supported.
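A minimal sketch of cloning a table definition follows; the table names and user name are placeholders, and the snippet composes and prints the curl command only.

```shell
# Compose a PUT that creates test_table_2 like test_table (placeholders).
BASE='http://localhost:50111/templeton/v1'
CMD="curl -s -X PUT $BASE/ddl/database/default/table/test_table/like/test_table_2?user.name=ekoifman"

# On a live cluster you would run:  eval "$CMD"
echo "$CMD"
```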
Apache Hive : WebHCat Reference ResponseTypes
Description
Returns a list of the response types supported by WebHCat (Templeton).
URL
http://www.myserver.com/templeton/:version
Parameters
| Name | Description | Required? | Default |
|---|---|---|---|
| :version | The WebHCat version number. (Currently this must be “v1”.) | Required | None |
The standard parameters are also supported.
Results
| Name | Description |
|---|---|
| responseTypes | A list of all supported response types |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1'
JSON Output
{
"responseTypes": [
"application/json"
]
}
JSON Output (error)
{
"error": "null for uri: http://localhost:50111/templeton/v2"
}
Navigation Links
Previous: Reference: WebHCat Resources
Next: GET status
Apache Hive : WebHCat Reference Status
Description
Returns the current status of the WebHCat (Templeton) server. Useful for heartbeat monitoring.
URL
http://www.myserver.com/templeton/v1/status
Parameters
Only the standard parameters are accepted.
Results
| Name | Description |
|---|---|
| status | “ok” if the WebHCat server was contacted. |
| version | String containing the version number similar to “v1”. |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/status'
JSON Output
{
"status": "ok",
"version": "v1"
}
Navigation Links Previous: Response Types (GET :version) Next: GET version
Apache Hive : WebHCat Reference Version
Description
Returns a list of supported versions and the current version.
URL
http://www.myserver.com/templeton/v1/version
Parameters
Only the standard parameters are accepted.
Results
| Name | Description |
|---|---|
| supportedVersions | A list of all supported versions. |
| version | The current version. |
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/version'
JSON Output
{
"supportedVersions": [
"v1"
],
"version": "v1"
}
Navigation Links
Previous: GET status
Next: GET version/hive
Apache Hive : WebHCat Reference VersionHadoop
Description
Return the version of Hadoop being run when WebHCat creates a MapReduce job (POST mapreduce/jar or mapreduce/streaming).
Version: Hive 0.13.0 and later
GET version/hadoop is introduced in Hive release 0.13.0 (HIVE-6226).
URL
http://www.myserver.com/templeton/v1/version/hadoop
Parameters
Only the standard parameters are accepted.
Results
Returns the Hadoop version.
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/version/hadoop?user.name=ekoifman'
JSON Output
[
{"module":"hadoop","version":"2.4.1-SNAPSHOT"}
]
Navigation Links
Previous: GET version/hive
Next: POST ddl
Apache Hive : WebHCat Reference VersionHive
Description
Return the version of Hive being run when WebHCat issues Hive queries or commands (POST hive).
Version: Hive 0.13.0 and later
GET version/hive is introduced in Hive release 0.13.0 (HIVE-6226).
URL
http://www.myserver.com/templeton/v1/version/hive
Parameters
Only the standard parameters are accepted.
Results
Returns the Hive version.
Example
Curl Command
% curl -s 'http://localhost:50111/templeton/v1/version/hive?user.name=ekoifman'
JSON Output
[
{"module":"hive","version":"0.14.0-SNAPSHOT"}
]
Navigation Links
Previous: GET version
Next: GET version/hadoop
Apache Hive : WebHCat UsingWebHCat
Version information
The HCatalog project graduated from the Apache incubator and merged with the Hive project on March 26, 2013.
Hive version 0.11.0 is the first release that includes HCatalog and its REST API, WebHCat.
Introduction to WebHCat
This document describes the HCatalog REST API, WebHCat, which was previously called Templeton.
As shown in the figure below, developers make HTTP requests to access Hadoop MapReduce (or YARN), Pig, Hive, and HCatalog DDL from within applications. Data and code used by this API are maintained in HDFS. HCatalog DDL commands are executed directly when requested. MapReduce, Pig, and Hive jobs are placed in queue by WebHCat (Templeton) servers and can be monitored for progress or stopped as required. Developers specify a location in HDFS into which Pig, Hive, and MapReduce results should be placed.