hadoop 0.20.2 capacity scheduler 配置方法

本文采用的是CDH4+MAPREDUCE 0.20

hadoop中共有6台机作为tasktracker ,每台机配置map和reduce个2个slots

队列名 Capacity Maximum  Capacity
 defaults  10%
 edp  50%  90%
 hive  40%  80%

设置完资源后,设置队列的ACL,必须具有相关权限才能向指定队列中提交任务。

队列名 权限
 defaults 所有用户可提交
 edp xhdeng可提交
 hive root及hive用户可提交

登录jobtracker机器

将/usr/lib/hadoop-0.20-mapreduce/contrib/capacity-scheduler/下的hadoop-capacity-scheduler-2.0.0-mr1-cdh4.0.1.jar 复制到/usr/lib/hadoop/lib/目录下

修改/etc/hadoop/conf 下的mapred-site.xml 在其中新增

<property>
<name>mapred.jobtracker.taskScheduler</name>
<value>org.apache.hadoop.mapred.CapacityTaskScheduler</value>
</property>

<property>
<name>mapred.queue.names</name>
<value>default,edp,hive</value>
</property>

<property>
<name>mapred.acls.enabled</name>
<value>true</value>
<description> Specifies whether ACLs should be checked
for authorization of users for doing various queue and job level operations.
ACLs are disabled by default. If enabled, access control checks are made by
JobTracker and TaskTracker when requests are made by users for queue
operations like submit job to a queue and kill a job in the queue and job
operations like viewing the job-details (See mapreduce.job.acl-view-job)
or for modifying the job (See mapreduce.job.acl-modify-job) using
Map/Reduce APIs, RPCs or via the console and web user interfaces.
</description>
</property>

在/etc/hadoop/conf 下新建capacity-scheduler.xml

<?xml version=”1.0″?>

<!– This is the configuration file for the resource manager in Hadoop. –>
<!– You can configure various scheduling parameters related to queues. –>
<!– The properties for a queue follow a naming convention,such as, –>
<!– mapred.capacity-scheduler.queue.<queue-name>.property-name. –>

<configuration>

<property>
<name>mapred.capacity-scheduler.maximum-system-jobs</name>
<value>6</value>
<description>Maximum number of jobs in the system which can be initialized,
concurrently, by the CapacityScheduler.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.capacity</name>
<value>10</value>
<description>Percentage of the number of slots in the cluster that are
to be available for jobs in this queue.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-capacity</name>
<value>-1</value>
<description>
maximum-capacity defines a limit beyond which a queue cannot use the capacity of the cluster.
This provides a means to limit how much excess capacity a queue can use. By default, there is no limit.
The maximum-capacity of a queue can only be greater than or equal to its minimum capacity.
Default value of -1 implies a queue can use complete capacity of the cluster.

This property could be to curtail certain jobs which are long running in nature from occupying more than a
certain percentage of the cluster, which in the absence of pre-emption, could lead to capacity guarantees of
other queues being affected.

One important thing to note is that maximum-capacity is a percentage , so based on the cluster’s capacity
the max capacity would change. So if large no of nodes or racks get added to the cluster , max Capacity in
absolute terms would increase accordingly.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.minimum-user-limit-percent</name>
<value>100</value>
<description> Each queue enforces a limit on the percentage of resources
allocated to a user at any given time, if there is competition for them.
This user limit can vary between a minimum and maximum value. The former
depends on the number of users who have submitted jobs, and the latter is
set to this property value. For example, suppose the value of this
property is 25. If two users have submitted jobs to a queue, no single
user can use more than 50% of the queue resources. If a third user submits
a job, no single user can use more than 33% of the queue resources. With 4
or more users, no user can use more than 25% of the queue’s resources. A
value of 100 implies no user limits are imposed.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.user-limit-factor</name>
<value>1</value>
<description>The multiple of the queue capacity which can be configured to
allow a single user to acquire more slots.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks</name>
<value>200000</value>
<description>The maximum number of tasks, across all jobs in the queue,
which can be initialized concurrently. Once the queue’s jobs exceed this
limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
<description>The maximum number of tasks per-user, across all the of the
user’s jobs in the queue, which can be initialized concurrently. Once the
user’s jobs exceed this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.default.init-accept-jobs-factor</name>
<value>10</value>
<description>The multipe of (maximum-system-jobs * queue-capacity) used to
determine the number of jobs which are accepted by the scheduler.
</description>
</property>

<!– The default configuration settings for the capacity task scheduler –>
<!– The default values would be applied to all the queues which don’t have –>
<!– the appropriate property for the particular queue –>
<property>
<name>mapred.capacity-scheduler.default-supports-priority</name>
<value>false</value>
<description>If true, priorities of jobs will be taken into
account in scheduling decisions by default in a job queue.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-minimum-user-limit-percent</name>
<value>100</value>
<description>The percentage of the resources limited to a particular user
for the job queue at any given point of time by default.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.default-user-limit-factor</name>
<value>1</value>
<description>The default multiple of queue-capacity which is used to
determine the amount of slots a single user can consume concurrently.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-maximum-active-tasks-per-queue</name>
<value>200000</value>
<description>The default maximum number of tasks, across all jobs in the
queue, which can be initialized concurrently. Once the queue’s jobs exceed
this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-maximum-active-tasks-per-user</name>
<value>100000</value>
<description>The default maximum number of tasks per-user, across all the of
the user’s jobs in the queue, which can be initialized concurrently. Once
the user’s jobs exceed this limit they will be queued on disk.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.default-init-accept-jobs-factor</name>
<value>10</value>
<description>The default multipe of (maximum-system-jobs * queue-capacity)
used to determine the number of jobs which are accepted by the scheduler.
</description>
</property>

<!– Capacity scheduler Job Initialization configuration parameters –>
<property>
<name>mapred.capacity-scheduler.init-poll-interval</name>
<value>5000</value>
<description>The amount of time in miliseconds which is used to poll
the job queues for jobs to initialize.
</description>
</property>
<property>
<name>mapred.capacity-scheduler.init-worker-threads</name>
<value>5</value>
<description>Number of worker threads which would be used by
Initialization poller to initialize jobs in a set of queue.
If number mentioned in property is equal to number of job queues
then a single thread would initialize jobs in a queue. If lesser
then a thread would get a set of queues assigned. If the number
is greater then number of threads would be equal to number of
job queues.
</description>
</property>

<property>
<name>mapred.capacity-scheduler.queue.hive.capacity</name>
<value>40</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-capacity</name>
<value>80</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.supports-priority</name>
<value>false</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.minimum-user-limit-percent</name>
<value>20</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.user-limit-factor</name>
<value>10</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks</name>
<value>200000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.hive.init-accept-jobs-factor</name>
<value>100</value>
</property>

<!– queue: edp –>
<property>
<name>mapred.capacity-scheduler.queue.edp.capacity</name>
<value>50</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-capacity</name>
<value>90</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.supports-priority</name>
<value>false</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.minimum-user-limit-percent</name>
<value>100</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.user-limit-factor</name>
<value>1</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-initialized-active-tasks</name>
<value>200000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.maximum-initialized-active-tasks-per-user</name>
<value>100000</value>
</property>
<property>
<name>mapred.capacity-scheduler.queue.edp.init-accept-jobs-factor</name>
<value>10</value>
</property>

</configuration>

在/etc/hadoop/conf 下新建mapred-queue-acls.xml

<configuration>
<property>
<name>mapred.queue.edp.acl-submit-job</name>
<value>xhdeng</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
<property>
<name>mapred.queue.hive.acl-submit-job</name>
<value>hive,root</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
<property>
<name>mapred.queue.default.acl-submit-job</name>
<value>*</value>
<description> Comma separated list of user and group names that are allowed
to submit jobs to the ‘default’ queue. The user list and the group list
are separated by a blank. For e.g. user1,user2 group1,group2.
If set to the special value ‘*’, it means all users are allowed to
submit jobs. If set to ‘ ‘(i.e. space), no user will be allowed to submit
jobs.

It is only used if authorization is enabled in Map/Reduce by setting the
configuration property mapred.acls.enabled to true.
Irrespective of this ACL configuration, the user who started the cluster and
cluster administrators configured via
mapreduce.cluster.administrators can submit jobs.
</description>
</property>
</configuration>

重启jobtracker,登录hadoop的map/reduce administration即可发现新的调度器生效。

 

Scheduling Information

Queue Name State Scheduling Information
hive running Queue configuration
Capacity Percentage: 40.0%
User Limit: 20%
Priority Supported: NO
————-
Map tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
default running Queue configuration
Capacity Percentage: 10.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0
edp running Queue configuration
Capacity Percentage: 50.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 6 slots
Maximum capacity: 10 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Reduce tasks
Capacity: 6 slots
Maximum capacity: 10 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 0

 

如果修改了调度器的配置文件,无需重启整个jobtracker,使用以下命令刷新即可。

hadoop mradmin -refreshQueues

查看当前用户的acl可以使用以下命令查看。

mapred queue -showacls

Queue acls for user : root

Queue Operations
=====================
hive submit-job,administer-jobs
default submit-job,administer-jobs
edp administer-jobs

用root用户往edp队列中跑一个任务测试一下:

sudo  hadoop jar    hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.1-tests.jar TestDFSIO -D mapred.job.queue.name=edp  -write -nrFiles 6 -fileSize 1000

然后,必然的,报错了。

12/11/23 16:16:19 ERROR security.UserGroupInformation: PriviledgedActionException as:root (auth:SIMPLE) cause:org.apache.hadoop.security.AccessControlException: User root cannot perform operation SUBMIT_JOB on queue edp.
Please run “hadoop queue -showacls” command to find the queues you have access to .

用root用户往edp用户中跑一个任务测试一下:

sudo  hadoop jar    hadoop-mapreduce-client-jobclient-2.0.0-cdh4.0.1-tests.jar TestDFSIO  -write -nrFiles 6 -fileSize 1000

然后,必然的,报错了。

default running Queue configuration
Capacity Percentage: 10.0%
User Limit: 100%
Priority Supported: NO
————-
Map tasks
Capacity: 1 slots
Used capacity: 1 (100.0% of Capacity)
Running tasks: 1
Active users:
User ‘root’: 1 (100.0% of used capacity)
————-
Reduce tasks
Capacity: 1 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 1

跑起来了,很悲催的是只能使用1个slot。。。

那往hive队列上再调一次。。

 

Queue Name State Scheduling Information
hive running Queue configuration
Capacity Percentage: 40.0%
User Limit: 20%
Priority Supported: NO
————-
Map tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 6 (150.0% of Capacity)
Running tasks: 6
Active users:
User ‘root’: 6 (100.0% of used capacity)
————-
Reduce tasks
Capacity: 4 slots
Maximum capacity: 9 slots
Used capacity: 0 (0.0% of Capacity)
Running tasks: 0
————-
Job info
Number of Waiting Jobs: 0
Number of Initializing Jobs: 0
Number of users who have submitted jobs: 1

提示:我们设置的capacity是4,而实际的则是Used capacity: 6 (150.0% of Capacity),符合我们的预期。

使用capacity scheduler ,集群的资源得到有效的管理可控制,即不会让一个用户跑死整个集群,也不会管得过死造成资源闲置。

 

 

 

 

攻城师的沟通修炼

软件开发做久了,都会有一些角色的转变,从最初的直接leader分配具体工作来做,到后来需要和不同部门一起协作开发,沟通都是必备技能与工具,无论升级为TL(team leader)、PM(project manager),还是继续做软件工程师。最初做研发时,可能更多的只是关注技术性的专业知识上,觉得做好分配给自己的本职工作是唯一目标。当你技术能力提升到一定高度的时候,领导会分配更大的项目让你负责,这时你发现,写代码不再是你8小时的全部内容,你需要和产品经理探讨需求,需要和其他攻城师讨论技术方案,需要和其他部门协调项目进度,需要和测试人员确认测试流程。。。。等等等等,你突然发现,沟通变得越来越重要,它已经成为你完成特定行动唯一真正有效的手段。
这种转型,对很多工程师来说是痛苦的,很多人当初选择做研发是觉得程序开发只面对机器而不是人,正好符合自己不善言辞的木讷内向性格,当自己在这行业学到很多技术,自以为可以松一口气的时候,突然发现自己的性格竟然不适合这个行业了,你是否在有了很多技术经验后有过这样的迷茫?
你是否在和别人沟通需求时这个表情?

你是否也像我曾经一样,在别人说观点时一直在寻找一次讲话的机会?或者干脆马上粗暴的打断别人来表达自己的立场?在产品经理提了一个大的需求改动时马上说这个做不了?在提出技术方案后意识到错误,但为了程序员的自尊以及觉得其他人都不懂技术而不去做道歉不做修改?在测试人员指出某个问题时觉得测试人员太过于较真,觉得这种情况一万年也不会出现而懒得修改?
最近读了一本工程师修炼的书,里面讲了最重要的一条就是沟通。
无论你从程序员升级为TL,还是PM,还是架构师,或者是高级工程师,其实都是一种晋级。沟通技术知识对你往上升是非常关键的技能。这种技能通常意味着维护你的职位,明确特定项目的潜在风险和当前问题。在单位等级结构的这一层上,你应该阻止产生问题、寻找问题并且解决问题。你的上级都在盯着看你的每一步动作。压力往往会非常大。对于靠技术吃饭的人来说,若想迈出跳至管理的第一步,阶梯上的下个台阶的特性已经大大变化了。尤其是,首先要求的技能是沟通范围、数量大大地拓宽了。

一、沟通原则
学习有效沟通是一个终身的过程——永远都有改善的余地。要学习的沟通原则包括:先听后说、专心致志(人和心思在一处)、正面思考等,这些原则有助于建立与别人的信任关系,使你成为更高超的沟通者。

1. 先听后说
你有没有发现自己在某次谈话中总是想寻求一次讲话的机会,而没有真正在听别人说什么?当你没有听时,你传递给那个对你讲话的人什么信息呢?
至少表面上,你显得不在乎别人说什么。大部分人会很快厌倦这样的谈话,因为他们说的话白白在空气中传播却没人去听。说话的人也许会想他还有更好的事要做,而结束此次谈话。如果你只是偶尔这样做,这种行为没什么大不了的。如果这是你的习惯做法,那么你是在自己与别人之间构筑一道墙。
你听的时候,是不是在找机会纠正对方?即便谈论的话题在往前走,但是你的思路还停留在刚才的某一点上?
这种情况说明,我们并没有在听别人说什么。讲话的人对你很在乎,从其忙碌的工作中抽出时间,为你提供这些宝贵的信息,所以应该认真去听他说什么。
当有人与你说话时,要看着他说,并试着理解他想沟通的内容。给对方足够的时间来表达他的观点,然后再向其询问要澄清的问题。向他表达非语言的反馈,例如点头,让他知道你在关注这次谈话。
我认为罗马人Epictetus说得好:“我们有两个耳朵,一个嘴巴,所以我们应该多听少说。”
2. 专心致志
不管你在哪里,都应专心致志。生活中有许多事情要分神,例如这个周末你要做什么,几分钟前开会回来如何解决会上的问题,怎样找个办法告诉老板某个负面的消息,小孩今晚的英式足球比赛几点开场……所有这些琐事都很容易让你想入非非。
一般来说,人在任何时刻最多只能同时处理7件±2件那么多的事。如果你的脑袋全是一些无关紧要的琐事,你就无法专心致志地做事。要是有人对你说话,你就会完全听不到他在讲什么。倘若他在问你问题,你可能要他们再说一遍。这种情况下,你其实是在浪费别人的时间,他们不会高兴的。如果房间里有个执行官,你就会给人家留下一个持久的坏印象:真是个浪费钱的家伙!
请你读些时间管理方面的书,列出每天需要关注事情的清单计划。在计划中安排好任务的优先级(当天、本周等),并标识每个任务准备投入的时间。这个办法能够让你通过计划安排好每件事,节省你的精力去记忆周围各种正在发生的事。
如果某个会议不是真的需要你参加,你就不要去;假如确实需要你参加,就一定要去,而且人和心思都花在那里。
我发现,坐直、将脚放在座位正下方、做笔记、直视正在讲话的人,能够自然而然地全神贯注于会议正在进行的事情,从说话和肢体语言两方面都给人以积极参与的正面印象。
不管你做什么,都要专心致志!
3. 正面思考
当你表达信息时,总有许多种方法去传递它们。信息需要真实和准确,然而表现出其意义的方式可以多种多样。
你可以以积极意义或消极意义提供信息。你可以基于所期望的结果选择某种方法。也可以采用不偏不倚的方式,不带情绪地列举事实,尽管这通常很难做到。
从沟通的观点来看,人们容易注意负面的东西。通常负面消息总会带来恐惧(当我感觉恐惧邻近时,我会把它当做“要求集中精力的行动”的信号)。
作为架构师,你需要避免不必要的偏见信息,让别人能够选择他们要关注的信息。你可以提供若干种替代方案,但这些方案应当是客观平等的。你需要察觉可能的办法,而不是为别人留下疑惑。
4. 尽早道歉
在一天的事务中,你可能注意到对他人做的某个事情不合适或不正确。记住放下自尊去给受影响的对方道一个歉。向别人诚心道歉并不是好玩或者容易之举,但你可以赢得别人的尊敬,展示你在尽力成长,尝试变得更好的意图。
如果你道歉,对方就有可能重新审视事情,而原谅你带来的任何苦恼伤痛。有些让人尴尬的事情转而有了积极意义。你与那人的关系就有了增进的机会,而不是就此冷淡。
人的本能倾向就是让冒犯别人后的情势不了了之。遗憾的是,你可能埋下了让它长大成祸患的种子,以致对你造成长期的影响。被得罪的人可能会耿耿于怀,在很长很长时间内记住这件事。那个人也许会把这件事告诉别人,说你是个什么类型的人。你和此人及周围其他人的交往能力可能大打折扣。最后,或许你已经忘记做过的事,但是对方却没有忘记。
道歉时,你要清楚地表达出要道歉的是什么,你说的是什么意思。如果你不是诚心道歉,虚假的说辞可能把事情弄得更糟。如果你不能表达诚意,就不要道歉,但你的目标应当是努力与你所交往的人修缮积极的关系。避免让道歉使你向错误的方向发展,限制你的个人成长。
5. 不要在缺陷上招致恼羞成怒
当你在开评审会(例如产品概念评估、需求评审、设计评审、代码评审、测试评审、产品发布评审)时,通常会检查出评审项目的一些缺陷。评审项目的作者对于这些暴露出的缺陷当然会感到不自在。

二、沟通策略
在我们研究了沟通的核心原则后,现在你可以应用一系列策略来展示恒定、高效的沟通风格。
1,多说“是”,少说“不是”
多想解决方案,解决问题,而不是粗暴的反对。
2,抑制想自卫的冲动
通常在交谈中,当我们听到并不完全对自己有正面意义的事情时,我们可能会找借口,我们可能会找办法转移话题,并责怪他人,以使自己脱离干系。或者我们想强词夺理,以阐述那些语句。应当避免做出此反应的冲动。相反,代之以等待,并接受别人所说的话。
在上一段描述的反应中,会谈的真正兴趣已经从对别人转移到你身上。听别人说话的行为至少暂时结束了,我们开始与会谈者发出警示信号:“我们把话题引向另一个方向吧,一个和我没关系的方向。”注意你正在用的肢体语言—胳膊交叉在胸前,或者头转向一边告诉别人“我不想听了”。
问自己这个问题:我能从这个人说的话中学到什么?”通常,他给出的信息也许并不是你乐意听到的,但其动机是好的,仍是你接受信息并获得个人成长的机会。
抑制想自卫的冲动的一个例外就是当手头的问题涉及企业政策或你的正直时。如果别人说的话使你真正涉及与公司政策冲突或你出于正直未做某事(假如,你已经正确做出了行动)时,你需要立即抨击这些说法。你可能想用澄清问题的办法来明确要点,比如“你的意思是我做过某事吗”。如果别人说“是的”,你就以“这并不准确”来明确回应;倘若人家回答“不是”,要感谢他澄清了此事。

转自:http://tech.weibo.com/?p=2103

 

ORACLE 的DRIVING_SITE HINT 在INSERT AS SELECT中无效

一个朋友问个SQL在本地表连接执行的部分非常快,当执行到最后通过DBLINK连接远程DB表时就很慢(本地执行后记录小于100条然后和远程DB连接。lot@pkgmes lt表1000W 并且有连接字段的PK)。后来在select部分添加hint /*+  DRIVING_SITE (lt) */  后,会把本地表传到远程去比对,子查询速度在1s以内。但是加上前面的insert hint好像是失效了,又变得很慢表和索引的统计信息都是最新的。

其实不光是INSERT AS SELECT中无效CREATE TABLE AS SELECT之类的也会无效

原因:

What happened?  That’s actually expected behaviour, a distributed DML statement must execute on the database where the DML target resides. The DRIVING_SITE hint cannot override this. DRIVING_SITE hint means that entire cursor (not a subquey) is supposed to be mapped remotely. That also means CREATE TABLE cannot be executed remotely (which is also the reason why you get ORA-2021 when you try to accomplish this with an Create Table table_name@remote_database).

So keep in mind when using the DRIVING_SITE hint this is merely for query optimization and not intended for DML or DDL.

解决办法
Create a view on the remote database (A) and then issue the insert
query by selecting from the view@link_name.

 

产品经理与项目经理

最近在做Gbase测试和使用公司一个内部研发产品的时候,中间有非常多的不顺利,究其原因很明显,开发的时候只关注核心功能的实现,而对于用户体验最重要的界面,提示,日志,操作方法等等反而在开发时候置于次要的,无关紧要的地位

看了一句话觉得挺好:“产品经理的职责是探索(定义)有价值、可用的、可行的产品;而项目经理的职责是关注如何执行计划以按期交付产品。”,我觉得还有一点区别是项目经理对客户负责,产品经理是对用户负责。面向客户!=面向用户。如果产品功能很NB,但是对用户非常不友好,用户体验极差,完全可以想象这是一个项目经理而非产品经理做出来的东西