博客从Linode搬到了HOSTUS

这个博客搬家了好多次。今天有从Linode搬到了HOSTUS,开了几次PPTP的VPN,然后貌似被GFW盯上了,狂丢包,找Linode申请换了IP,是好了,但是想想Linode 10刀一个月,是有点贵。索性就换到了HOSTUS的HK节点。速度也不错。配置也够用,关键还比价便宜,35刀一年。装了shadowsocks,翻墙速度还不错。就不装PPTP vpn了,HOSTUS有个不好的是不给换IP,给GFW盯上就废了。

趁着这次搬迁,顺便调整了一下各个应用的版本和配置。。

  • Nginx升级到了1.9.7
  • 全站强制SSL
  • 支持 HSTS (HTTP Strict Transport Security)
  • 支持 Certificate Transparency
  • 支持OCSP Stapling(Online Certificate Status Protocol)
  • 启用了HTTP 2.0,也兼容HTTP 1.1
  • 去掉对不安全SSL协议和Cipher的支持
  • PHP升级到5.6.15
  • 启用了Xcache
  • 免费SSL证书提供商从Startssl.com 换到了letsencrypt.org
  • DNS Server从dnspod换回了Godaddy
  • WordPress上用Updraftplus 每日将数据备份到Google Drive

NoSQL 数据库的分布式算法

Distributed Algorithms in NoSQL Databases 

 

Scalability is one of the main drivers of the NoSQL movement. As such, it encompasses distributed system coordination, failover, resource management and many other capabilities. It sounds like a big umbrella, and it is. Although it can hardly be said that NoSQL movement brought fundamentally new techniques into distributed data processing, it triggered an avalanche of practical studies and real-life trials of different combinations of protocols and algorithms. These developments gradually highlight a system of relevant database building blocks with proven practical efficiency. In this article I’m trying to provide more or less systematic description of techniques related to distributed operations in NoSQL databases.

In the rest of this article we study a number of distributed activities like replication of failure detection that could happen in a database. These activities, highlighted in bold below, are grouped into three major sections:

  • Data Consistency. Historically, NoSQL paid a lot of attention to tradeoffs between consistency, fault-tolerance and performance to serve geographically distributed systems, low-latency or highly available applications. Fundamentally, these tradeoffs spin around data consistency, so this section is devoted?data replication?and?data repair.
  • Data Placement. A database should accommodate itself to different data distributions, cluster topologies and hardware configurations. In this section we discuss how todistribute or rebalance data?in such a way that failures are handled rapidly, persistence guarantees are maintained, queries are efficient, and system resource like RAM or disk space are used evenly throughout the cluster.
  • System Coordination. Coordination techniques like?leader election?are used in many databases to implements fault-tolerance and strong data consistency. However, even decentralized databases typically track their global state,?detect failures and topology changes. This section describes several important techniques that are used to keep the system in a coherent state.

继续阅读“NoSQL 数据库的分布式算法”

使用ORACLE GATEWAY 与Greenplum HAWQ 1.0 互通

现场环境:Oracle 11g 64bit linux edition

Greenplum Pivotal HD HAWQ 1.0

操作系统 RHEL 6.3 x86_64

为了实现在Oracle中直接访问HAWQ中的数据,需要首先安装Oracle Gateway 11g x86_64 linux edition

 

需要安装的软件:Oracle Gateway 11g x86_64 linux edition

DATADIRECT_CONNECT64_ODBC_7.0.1for Greenplum (必须是7版本的,6的不行,rhel6带的unixodbc也不行)

 

( 后记,Oracle 11g database server 中貌似也带了hs的功能,直接使用database server 当gateway应也可,不用再单独安装gateway)

安装路径: Oracle Gateway:/home/oracle/product/11.2.0/tg_1

Datadirect ODBC :/home/oracle/ddodbc

Oracle database :/home/oracle/app/oracle/product/11.2.0/dbhome_1

 

配置:

修改/home/oracle/product/11.2.0/tg_1/hs/admin/initdg4odbc.ora

# This is a sample agent init file that contains the HS parameters that are
# needed for the Database Gateway for ODBC
#
# HS init parameters
#
HS_FDS_CONNECT_INFO = Greenplum
#HS_FDS_TRACE_LEVEL = 0
HS_FDS_TRACE_LEVEL= ODBC
HS_FDS_TRACE_FILE_NAME = /home/oracle/odbc_trace.log
HS_FDS_SHAREABLE_NAME = /usr/lib64/libodbc.so
HS_FDS_SUPPORT_STATISTICS = FALSE
#HS_LANGUAGE=AMERICAN_AMERICA.WE8ISO8859P1
HS_LANGUAGE = AMERICAN_AMERICA.UTF8
HS_FDS_TIMESTAMP_MAPPING = “TIMESTAMP(6)”
HS_FDS_FETCH_ROWS=1
#HS_FDS_SQLLEN_INTERPRETATION=32
HS_KEEP_REMOTE_COLUMN_SIZE=LOCAL
HS_NLS_NCHAR = UCS2
#
# ODBC specific environment variables
#
set ODBCINI=/etc/odbc.ini
set ODBCINST=/etc/odbcinst.ini

修改/home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/listener.ora

备注,此处直接使用database的listener。未使用gateway的listener ,在文件后追加:

SID_LIST_LISTENER =
        (SID_LIST =
                (SID_DESC =
                        (SID_NAME = dg4odbc)
                        (ORACLE_HOME =/home/oracle/product/11.2.0/tg_1 )
                        (ENV=”LD_LIBRARY_PATH=/usr/local/lib:/home/oracle/product/11.2.0/tg_1/lib:/home/oracle/ddodbc/lib”)
                        (PROGRAM = dg4odbc)
                )
        )
)
修改/home/oracle/app/oracle/product/11.2.0/dbhome_1/network/admin/tnsnames.ora
增加:
dg4odbc  =
  (DESCRIPTION=
    (ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521))
    (CONNECT_DATA=(SID=dg4odbc))
    (HS=OK)
  )
修改/etc/odbc.ini
[oracle@server3 admin]$ cat /etc/odbc.ini
[ODBC Data Sources]
Greenplum=Greenplum
[ODBC Data Sources]
Greenplum Wire Protocol=DataDirect 7.0 Greenplum Wire Protocol
[ODBC]
IANAAppCodePage=4
InstallDir=/home/oracle/ddodbc
Trace=0
TraceFile=odbctrace.out
TraceDll=/home/oracle/ddodbc/lib/ddtrc26.so
[Greenplum]
Driver=/home/oracle/ddodbc/lib/ddgplm26.so
Description=DataDirect 7.0 Greenplum Wire Protocol
AlternateServers=
ApplicationUsingThreads=1
ConnectionReset=0
ConnectionRetryCount=0
ConnectionRetryDelay=3
Database=haier
DefaultLongDataBuffLen=2048
EnableDescribeParam=0
EnableKeysetCursors=0
EncryptionMethod=0
ExtendedColumnMetadata=0
FailoverGranularity=0
FailoverMode=0
FailoverPreconnect=0
FetchTSWTZasTimestamp=0
FetchTWFSasTime=0
HostName=server3
InitializationString=
KeyPassword=
KeysetCursorOptions=0
KeyStore=
KeyStorePassword=
LoadBalanceTimeout=0
LoadBalancing=0
LoginTimeout=15
LogonID=
MaxPoolSize=100
MinPoolSize=0
Password=
Pooling=0
PortNumber=54321
QueryTimeout=0
ReportCodepageConversionErrors=0
TransactionErrorBehavior=1
XMLDescribeType=-10
Charset=utf8
[ODBC]
TraceFile=/tmp/sql.log
Trace=1

 

 

修改/etc/odbcinst.ini

[ODBC Drivers]
DataDirect 7.0 Greenplum Wire Protocol=Installed
[ODBC Translators]
OEM to ANSI=Installed
[Administrator]
HelpRootDirectory=/home/oracle/ddodbc/adminhelp
[ODBC]
#This section must contain values for DSN-less connections
#if no odbc.ini file exists. If an odbc.ini file exists,
#the values from that [ODBC] section are used.
[DataDirect 7.0 Greenplum Wire Protocol]
Driver=/home/oracle/ddodbc/lib/ddgplm26.so
Setup=/home/oracle/ddodbc/lib/ddgplm26.so
APILevel=0
ConnectFunctions=YYY
DriverODBCVer=3.52
FileUsage=0
HelpRootDirectory=/home/oracle/ddodbc/help
SQLLevel=0

 

重启listener

lsnrctl status 中应有一下内容:

Listening Endpoints Summary…
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=server3)(PORT=1521)))
Services Summary…
Service “dg4odbc” has 1 instance(s).
  Instance “dg4odbc”, status UNKNOWN, has 1 handler(s) for this service…
而后在数据库中创建dblink,注意用户名和密码需要使用双引号

create database link DBLINK
connect to “gpadmin” identified by “gpadmin”
using ‘dg4odbc’;

执行sql语句即可返回成功(注意HAWQ中的表名和字段名需要使用双引号,用小写表示):

select count(*) from “poc1″@dblink;

 

HADOOP 的安全性

背景

当Hadoop在2004年开始开发的时候,对如何创建一个安全的分布式计算式环境上没有考虑,Hadoop框架对用户及服务的验证和授权严重不足,用户可以仿冒任意一个HDFS和mapreduce上的用户,恶意的代码可以以任何一个用户提交到Job tracker,尽管在版本0.16以后, HDFS增加了文件和目录的权限,但由于用户无需认证,HDFS的权限控制也极其容易绕过,允许一个用户伪装成任意一个用户,同时计算框架也没有进行双向验证,一个恶意的网络用户可以模拟一个正常的集群服务加入到hadoop集群,去接受JobTracker, Namenode的任务指派。

此时HADOOP要安全只能靠严格的的网络隔离,能访问hadoop集群的用户都是完全可信赖的用户。

到了2009年,关于hadoop安全性的讨论已经接近白热化了,安全被作为一个高优先级的问题摆了出来,Hadoop 2010年的开发计划就包含的用户与服务的双向认证,并且对终端用户透明。具体可参见【HADOOP-4487 Security features for Hadoop7】

安全设计时有以下6大目标:

1.2 Requirements

1. Users are only allowed to access HDFS files that they have permission to access.

2. Users are only allowed to access or modify their own MapReduce jobs.

3. User to service mutual authentication to prevent unauthorized NameNodes, DataNodes, JobTrackers, or TaskTrackers.

4. Service to service mutual authentication to prevent unauthorized services from joining a cluster’s HDFS or MapReduce service.

5. The acquisition and use of Kerberos credentials will be transparent to the user and applications, provided that the operating system acquired a Kerberos Ticket Granting Tickets (TGT) for the user at login.

6. The degradation of GridMix performance should be no more than 3%.

为了达到这目标,开发者决定使用kerberos进行认证。选择kerberos而非SSL基于以下理由:

1. kerberos使用对称密钥可以达到更高的性能,比使用公钥的SSL更快。

2. 简化了用户管理。例如,取消一个用户只需要简单的将用户从KDC里面删除即可,而如果使用SSL就需要生成证书注销列表(CRL) 并分发到所有服务器。

Hadoop的安全设计有以下七大假设

1.For backwards compatibility and single-user clusters, it will be possible to configure the cluster with the current style of security.

2. Hadoop itself does not issue user credentials or create accounts for users. Hadoop depends on external user credentials (e.g. OS login, Kerberos credentials, etc). Users are expected to acquire those credentials from Kerberos at operating system login. Hadoop services should also be configured with suitable credentials depending on the cluster setup to authenticate with each other.

3. Each cluster is set up and configured independently. To access multiple clusters, a client needs to authenticate to each cluster separately. However, a single sign on that acquires a Kerberos ticket will work on all appropriate clusters.

4. Users will not have access to root accounts on the cluster or on the machines that are used to launch jobs.

5. HDFS and MapReduce communication will not travel on untrusted networks.

6. A Hadoop job will run no longer than 7 days (configurable) on a MapReduce cluster or accessing HDFS from the job will fail.

7. Kerberos tickets will not be stored in MapReduce jobs and will not be available to the job’s tasks. Access to HDFS will be authorized via delegation tokens as explained in section 4.1.

以上七个假设基本就是Hadoop安全设计的七大安全缺陷。了解以后也就可以更有针对的对安全问题进行防范。

为了保证开发进度以及保证向后兼容,开发者在设计过程中对安全设计进行了部分妥协,首先新的设计要求终端用户不能在集群中的任何机器上拥有高权限的管理帐户,例如root。如果终端用户对集群中的机器拥有root权限,他就可以发现 Delegation Tokens, Job Tokens, Block Access Tokens或者symmetric encryption keys ,进而颠覆整个系统的安全保证。同时在开发过程中决定保证安全的新功能对于集群性能的影响不大于3%,这导致的开发者倾向使用对称加密算法,并且对网络传输不加密。

 部署hadoop kerberos方案后的安全顾虑

在hadoop新的安全设计中使用的 “threat model” 引起了一些新的安全性上的考虑。首先是SASL (Simple Authenticationand Security Layer)默认提供的安全保护,广泛分发的对称密钥。二是因为强调性能以及加密耗费资源的前提假设,hadoop使用了非常低的默认SASL  QoP (Quality of Protection),尽管SASL over kerberos其实可以做到更好。但hadoop的默认QoP只是进行认证,对于网络通信的完整性以及加密功能并未提供。Hadoop的PRC通信就可以被窃听及修改。同时新的Hadoop 安全设计依赖于HMAC-SHA1 这个对称密钥加密算法。因为Block Access Token 需要,HMAC-SHA1的对称密钥需要分发到Namenode以及所有的datanode机器,这意味着成百上千台的服务器。如果密钥被攻击者发现那么所有datanode上的数据都极其容易受到攻击。只要知道DATAID,攻击者就可以仿冒一个Block Access Token ,进而将hadoop的安全行降低到上一个版本的水平。而且HDFS Proxy使用的是基于IP的认证。一旦获得了HDFS Proxy Server的访问权,就可以读取HDFS Proxy Server 所被授权的所有数据。

二.CDH 4 mrv1 与kerberos集成

测试使用版本如下:

kerberos server 版本:krb5-server-1.9-33.el6.x86_64

Hadoop 版本: CDH4 W/ MRV1

操作系统版本:Redhat Enterprise Linux 6 update 3 64bit

JAVA版本:jdk-6u35-linux-x64

 

服务器名 IP 用途
rhel227.hadoop.hylandtec.com 172.16.130.227 Kerberos 服务器
rhel63.hadoop.hylandtec.com 172.16.130.234 Hadoop服务器

两台服务器均已经在DNS server上配置好正向及反向解析。hadoop.hylandtec.com 指向kerberos服务器。

安装kerberos服务器

作为测试,仅使用kerberos作为认证服务器,安装系统盘中的krb5-server-1.9-33.el6.x86_64.rpm,krb5-libs-1.9-33.el6.x86_64, krb5-workstation-1.9-33.el6.x86_64

kerberos realm 我们规划为HADOOP.HYLANDTEC.COM,以下配置中注意realm 必须为大写!

编辑/etc/krb5.conf

[logging]

default = FILE:/var/log/krb5libs.log

kdc = FILE:/var/log/krb5kdc.log

admin_server = FILE:/var/log/kadmind.log

[libdefaults]

default_realm = HADOOP.HYLANDTEC.COM

dns_lookup_realm = false

dns_lookup_kdc = false

ticket_lifetime = 24h

renew_lifetime = 7d

forwardable = true

[realms]

HADOOP.HYLANDTEC.COM = {

kdc = rhel227.hadoop.hylandtec.com:88

admin_server = rhel227.hadoop.hylandtec.com:749

default_domain = HADOOP.HYLANDTEC.COM  }

[domain_realm]

.hadoop.hylandtec.com = HADOOP.HYLANDTEC.COM

hadoop.hylandtec.com = HADOOP.HYLANDTEC.COM

[kdc]

profile = /var/kerberos/krb5kdc/kdc.conf

以root用户执行,按照提数输入master key:

 kdb5_util create -s

编辑/var/kerberos/krb5kdc/kdc.conf 文件。设置如下:

[kdcdefaults]

kdc_ports = 88

kdc_tcp_ports = 88

[realms]

HADOOP.HYLANDTEC.COM = {

#master_key_type = aes256-cts

max_renewable_life = 7d

max_life = 2d 0h 0m 0s

default_principal_flags = +renewable

krbMaxTicketLife = 172800

krbMaxRenewableAge = 604800

acl_file = /var/kerberos/krb5kdc/kadm5.acl

dict_file = /usr/share/dict/words

admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab

supported_enctypes =  aes128-cts:normal des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal

}

编辑/var/kerberos/krb5kdc/kadm5.acl 文件,配置如下:

*/[email protected]    *

执行以下命令,新增root用户

kadmin.local  -q “addprinc root/[email protected]

配置完成后执行以下命令启动kerberos服务:

service krb5kdc start

service kadmin start

经配置好的/etc/krb5.conf 文件分发至hadoop各台机器即可。kerberos服务器安装完成。

 

HADOOP kerberos 配置

按照官方文档配置即可,没有问题。

https://ccp.cloudera.com/display/CDH4DOC/Configuring+Hadoop+Security+in+CDH4

有时候kerberos出错的提示很不明朗的时候。设置如下环境变量很有帮助:

export HADOOP_OPTS=”-Dsun.security.krb5.debug=true”

参考资料:

【1】 Hadoop Securit Design  

【2】Hadoop Security Design Just Add Kerberos? Really?

【3】Hadoop Security Overview

【4】Hadoop Kerberos安全机制介绍

【5】Configuring Hadoop Security in CDH4

 

RFC 3961

Kerberos ETypes Kerberos加密类型代码

 

	encryption type                etype      section or comment
      -----------------------------------------------------------------
      des-cbc-crc                        1             6.2.3
      des-cbc-md4                        2             6.2.2
      des-cbc-md5                        3             6.2.1
      [reserved]                         4
      des3-cbc-md5                       5
      [reserved]                         6
      des3-cbc-sha1                      7
      dsaWithSHA1-CmsOID                 9           (pkinit)
      md5WithRSAEncryption-CmsOID       10           (pkinit)
      sha1WithRSAEncryption-CmsOID      11           (pkinit)
      rc2CBC-EnvOID                     12           (pkinit)
      rsaEncryption-EnvOID              13   (pkinit from PKCS#1 v1.5)
      rsaES-OAEP-ENV-OID                14   (pkinit from PKCS#1 v2.0)
      des-ede3-cbc-Env-OID              15           (pkinit)
      des3-cbc-sha1-kd                  16              6.3
      aes128-cts-hmac-sha1-96           17          [KRB5-AES]
      aes256-cts-hmac-sha1-96           18          [KRB5-AES]
      rc4-hmac                          23          (Microsoft)
      rc4-hmac-exp                      24          (Microsoft)
      subkey-keymaterial                65     (opaque; PacketCable)

 

 

hadoop 0.20.2 capacity scheduler 配置方法

本文采用的是CDH4+MAPREDUCE 0.20

hadoop中共有6台机作为tasktracker ,每台机配置map和reduce个2个slots

队列名 Capacity Maximum  Capacity
 defaults  10%
 edp  50%  90%
 hive  40%  80%

设置完资源后,设置队列的ACL,必须具有相关权限才能向指定队列中提交任务。

队列名 权限
 defaults 所有用户可提交
 edp xhdeng可提交
 hive root及hive用户可提交

登录jobtracker机器

将/usr/lib/hadoop-0.20-mapreduce/contrib/capacity-scheduler/下的hadoop-capacity-scheduler-2.0.0-mr1-cdh4.0.1.jar 复制到/usr/lib/hadoop/lib/目录下

修改/etc/hadoop/conf 下的mapred-site.xml 在其中新增

<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>

<property>
<name>mapred.queue.names</name>
<value>default,edp,hive</value>
</property>

<property>
<name>mapred.acls.enabled</name>
<value>true</value>
<description> Specifies whether ACLs should be checked
for authorization of users for doing various queue and job level operations.
ACLs are disabled by default. If enabled, access control checks are made by
JobTracker and TaskTracker when requests are made by users for queue
operations like submit job to a queue and kill a job in the queue and job
operations like viewing the job-details (See mapreduce.job.acl-view-job)
or for modifying the job (See mapreduce.job.acl-modify-job) using
Map/Reduce APIs, RPCs or via the console and web user interfaces.
</description>
</property>

在/etc/hadoop/conf 下新建capacity-scheduler.xml

<?xml version=”1.0″?>

<!– This is the configuration file for the resource manager in Hadoop. –>
<!– You can configure various scheduling parameters related to queues. –>
<!– The properties for a queue follow a naming convention,such as, –>
<!– mapred.capacity-scheduler.queue.<queue-name>.property-name. –>

<configuration>

<property>
<name>mapred.capacity-scheduler.maximum-system-jobs</name>
<value>6</value>
<description>Maximum number of jobs in the system which can be initialized,
concurrently, by the CapacityScheduler.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.capacity</name>
<value>10</value>
<description>Percentage of the number of slots in the cluster that are
to be available for jobs in this queue.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-capacity</name>
<value>-1</value>
<description>
maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster.
This provides a means to limit how much excess capacity a queue can use. By default, there is no limit.
The maximum-capacity of a queue can only be greater than or equal to its minimum capacity.
Default value of -1 implies a queue can use complete capacity of the cluster.

This property could be to curtail certain jobs which are long running in nature from occupying more than a
certain percentage of the cluster, which in the absence of pre-emption, could lead to capacity guarantees of
other queues being affected.

One important thing to note is that maximum-capacity is a percentage , so based on the cluster’s capacity
the max capacity would change. So if large no of nodes or racks get added to the cluster , max Capacity in
absolute terms would increase accordingly.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
<value>100</value>
<description> Each queue enforces a limit on the percentage of resources
allocated to a user at any given time, if there is competition for them.
This user limit can vary between a minimum and maximum value. The former
depends on the number of users who have submitted jobs, and the latter is
set to this property value. For example, suppose the value of this
property is 25. If two users have submitted jobs to a queue, no single
user can use more than 50% of the queue resources. If a third user submits
a job, no single user can use more than 33% of the queue resources. With 4
or more users, no user can use more than 25% of the queue’s resources. A
value of 100 implies no user limits are imposed.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.user-limit-factor</name>
<value>1</value>
<description>The multiple of the queue capacity which can be configured to
allow a single user to acquire more slots.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks</name>
<value>200000</value>
<description>The maximum number of tasks, across all jobs in the queue,
which can be initialized concurrently. Once the queue’s jobs exceed this
limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
<description>The maximum number of tasks per-user, across all the of the
user’s jobs in the queue, which can be initialized concurrently. Once the
user’s jobs exceed this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.init-accept-jobs-factor</name>
<value>10</value>
<description>The multipe of (maximum-system-jobs * queue-capacity) used to
determine the number of jobs which are accepted by the scheduler.
</description>
</property>

<!– The default configuration settings for the capacity task scheduler –>
<!– The default values would be applied to all the queues which don’t have –>
<!– the appropriate property for the particular queue –>
<property>
<name>mapred.capacity-scheduler.default-supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions by default in a job queue.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-minimum-user-limit-percent</name>
<value>100</value>
<description>The percentage of the resources limited to a particular user
for the job queue at any given point of time by default.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.default-user-limit-factor</name>
<value>1</value>
<description>The default multiple of queue-capacity which is used to
determine the amount of slots a single user can consume concurrently.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-maximum-active-tasks-per-queue</name>
<value>200000</value>
<description>The default maximum number of tasks, across all jobs in the
queue, which can be initialized concurrently. Once the queue’s jobs exceed
this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-maximum-active-tasks-per-user</name>
<value>100000</value>
<description>The default maximum number of tasks per-user, across all the of
the user’s jobs in the queue, which can be initialized concurrently. Once
the user’s jobs exceed this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-init-accept-jobs-factor</name>
<value>10</value>
<description>The default multipe of (maximum-system-jobs * queue-capacity)
used to determine the number of jobs which are accepted by the scheduler.
</description>
</property>

<!– Capacity scheduler Job Initialization configuration parameters –>
<property>
<name>mapred.capacity-scheduler.init-poll-interval</name>
<value>5000</value>
<description>The amount of time in miliseconds which is used to poll
the job queues for jobs to initialize.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.init-worker-threads</name>
<value>5</value>
<description>Number of worker threads which would be used by
Initialization poller to initialize jobs in a set of queue.
If number mentioned in property is equal to number of job queues
then a single thread would initialize jobs in a queue. If lesser
then a thread would get a set of queues assigned. If the number
is greater then number of threads would be equal to number of
job queues.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.hive.capacity</name>
<value>40</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-capacity</name>
<value>80</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.supports-priority</name>
<value>false</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.user-limit-factor</name>
<value>10</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks</name>
<value>200000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.init-accept-jobs-factor</name>
<value>100</value>
</property>

<!– queue: edp –>
<property>
<name>mapred.capacity-scheduler.queue.edp.capacity</name>
<value>50</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-capacity</name>
<value>90</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.supports-priority</name>
<value>false</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.minimum-user-limit-percent</name>
<value>100</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-initialized-active-tasks</name>
<value>200000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.init-accept-jobs-factor</name>
<value>10</value>
</property>

</configuration>

在/etc/hadoop/conf 下新建mapred-queue-acls.xml

<configuration>
<property>
<name>mapred.queue.edp.acl-submit-job</name>
<value>xhdeng</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
<property>
<name>mapred.queue.hive.acl-submit-job</name>
<value>hive,root</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
<property>
<name>mapred.queue.default.acl-submit-job</name>
<value>*</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
</configuration>

重启jobtracker,登录hadoop的map/reduce administration即可发现新的调度器生效。

 

Scheduling Information

Queue Name State Scheduling Information
hive running Queue configuration
Capacity Percentage: 40.0%
User Limit: 20%
Priority Supported: NO
————-
Map tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
default running Queue configuration
Capacity Percentage: 10.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
edp running Queue configuration
Capacity Percentage: 50.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 6 slots
Maximum capacity: 10 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 6 slots
Maximum capacity: 10 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0

 

如果修改了调度器的配置文件,无需重启整个jobtracker,使用以下命令刷新即可。

hadoop mradmin -refreshQueues

查看当前用户的acl可以使用以下命令查看。

mapred queue -showacls

Queue acls for user : root

Queue Operations
=====================
hive submit-job,administer-jobs
default submit-job,administer-jobs
edp administer-jobs

用root用户往edp队列中跑一个任务测试一下:

sudo  hadoop jar    hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.1-tests.jar TestDFSIO -D mapred.job.queue.name=edp  -write -nrFiles 6 -fileSize 1000

然后,必然的,报错了。

12/11/23 16:16:19 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: User root cannot perform operation SUBMIT_JOB on queue edp.
Please run “hadoop queue -showacls” command to find the queues you have access to .

用root用户往edp用户中跑一个任务测试一下:

sudo  hadoop jar    hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.1-tests.jar TestDFSIO  -write -nrFiles 6 -fileSize 1000

然后,必然的,报错了。

default running Queue configuration
Capacity Percentage: 10.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 1 slots
Used capacity: 1 (100.0% of Capacity)
Running tasks: 1
Active users:
User ‘root’: 1 (100.0% of used capacity)
————-
Reduce tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 1

跑起来了,很悲催的是只能使用1个slot。。。

那往hive队列上再调一次。。

 

Queue Name State Scheduling Information
hive running Queue configuration
Capacity Percentage: 40.0%
User Limit: 20%
Priority Supported: NO
————-
Map tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 6 (150.0% of Capacity)
Running tasks: 6
Active users:
User ‘root’: 6 (100.0% of used capacity)
————-
Reduce tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 1

提示:我们设置的capacity是4,而实际的则是Used capacity: 6 (150.0% of Capacity),符合我们的预期。

使用capacity scheduler ,集群的资源得到有效的管理可控制,即不会让一个用户跑死整个集群,也不会管得过死造成资源闲置。