• Sunqc
    2019-08-10
    If the delete succeeds on the primary shard but fails on the replica shard, does the request ultimately return a failure? Or can it never happen that only one side succeeds?
     1
     3
  • 艾文
    2019-08-11
    Teacher, why doesn't Elasticsearch use a consistent-hashing algorithm for routing documents?

    Author's reply: That is a very good question. Let me paste a passage from the Elasticsearch documentation that I hope will help.

    Why doesn’t Elasticsearch support incremental resharding?
    Going from N shards to N+1 shards, aka. incremental resharding, is indeed a feature that is supported by many key-value stores. Adding a new shard and pushing new data to this new shard only is not an option: this would likely be an indexing bottleneck, and figuring out which shard a document belongs to given its _id, which is necessary for get, delete and update requests, would become quite complex. This means that we need to rebalance existing data using a different hashing scheme.

    The most common way that key-value stores do this efficiently is by using consistent hashing. Consistent hashing only requires 1/N-th of the keys to be relocated when growing the number of shards from N to N+1. However Elasticsearch’s unit of storage, shards, are Lucene indices. Because of their search-oriented data structure, taking a significant portion of a Lucene index, be it only 5% of documents, deleting them and indexing them on another shard typically comes with a much higher cost than with a key-value store. This cost is kept reasonable when growing the number of shards by a multiplicative factor as described in the above section: this allows Elasticsearch to perform the split locally, which in-turn allows to perform the split at the index level rather than reindexing documents that need to move, as well as using hard links for efficient file copying.

    In the case of append-only data, it is possible to get more flexibility by creating a new index and pushing new data to it, while adding an alias that covers both the old and the new index for read operations. Assuming that the old and new indices have respectively M and N shards, this has no overhead compared to searching an index that would have M+N shards.
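
    For reference, a minimal console sketch of the multiplicative-factor split the passage above refers to (the index names are made up; assumes Elasticsearch 6.1+, where the _split API is available):

    # the source index must be made read-only before it can be split
    PUT my_index/_settings
    { "index.blocks.write": true }

    # split 2 primary shards into 4; the target count must be a multiple of the source count
    POST my_index/_split/my_index_4shards
    { "settings": { "index.number_of_shards": 4 } }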

     1
     1
  • 低调光环
    2019-08-08
    A question: in the video, are update and delete requests for a document sent to the master node first, or are they distributed to some node by a load balancer sitting in front of the cluster?

    Author's reply: In the video the requests go to port 9200, and in my development environment I did not configure any dedicated node roles, so that node is the master, a data node, and of course also a coordinating node.

    In production you can set up dedicated coordinating nodes and send queries to them. Sending requests directly to the master node is not recommended: it will work, but a large volume of requests hitting the master creates a potential performance problem.
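
    For reference, a dedicated coordinating node in Elasticsearch 7.x is simply a node with every other role switched off; a minimal elasticsearch.yml sketch (the node name is made up):

    # coordinating-only node: routes and merges requests, holds no data, never becomes master
    node.name: coord-1
    node.master: false
    node.data: false
    node.ingest: false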

    
     1
  • GaelYang
    2019-08-06
    One question: when replicas exist, is a document indexed on the replica shards at the same time as on the primary?

    Author's reply: Correct. That is why having too many replicas slows down indexing.
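
    For reference, a common way to exploit this during a large bulk load is to index with zero replicas and add them back afterwards; a console sketch (the index name is made up):

    # switch replicas off before the bulk load
    PUT my_index/_settings
    { "index.number_of_replicas": 0 }

    # ... run the bulk indexing ...

    # restore a replica afterwards; the copy is then built in one pass instead of per document
    PUT my_index/_settings
    { "index.number_of_replicas": 1 }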

    
     1
  • 刘应明
    2019-08-04
    My understanding is that the update flow, like delete, should also be synchronized to the replica shards; the video did not show this.

    Author's reply: Your understanding is correct. Showing it in the video would have made the explanation more rigorous.

     1
     1
  • godtrue
    2019-09-21
    A few questions:
    1. When adding a document, where does the document ID come from? Is the ID unique within an index, or unique across the whole cluster?
    2. Query, update, and delete operations also need to know the document ID. Is that ID obtained through the inverted index? Where exactly is the inverted-index data structure stored? Every node that may operate on documents needs to obtain document IDs, in other words it needs access to the inverted index; if the inverted index is large, how is it stored?
    3. I now have the relationship between nodes, shards, and documents mostly sorted out. Nodes roughly fall into master nodes, coordinating (routing) nodes, and data nodes, each with its own responsibilities. A node can host N shards, and shards are what actually store the data, split into primary and replica shards. An index can have N primary shards, and which shard a document of that index lands on is computed dynamically: shard number = hash(document ID) % number of primary shards. Replica shards provide data redundancy and share the read load. Remember: a primary shard and its replica can never sit on the same node.
    Finally got it all straight. Feels great, thanks!
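
    For points 1 and 3 above, a small console sketch (the index name and field are made up): the _id only needs to be unique within one index, and requests addressed by _id are routed with hash(_id) % number of primary shards.

    # explicit ID: the caller supplies _id, which must be unique within this index
    PUT my_index/_doc/1
    { "user": "godtrue" }

    # auto-generated ID: Elasticsearch creates a unique _id for the new document
    POST my_index/_doc
    { "user": "godtrue" }

    # GET / update / delete by _id are routed by the same hash, without going through a search
    GET my_index/_doc/1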
    
    
  • Coisini
    2019-09-04
    Teacher, when a search request comes in, can it be dispatched to a replica shard to execute the query?

    Author's reply: The query may be executed on either the primary shard or a replica shard, so it cannot be ruled out that a replica which has not fully caught up returns slightly inconsistent data.
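
    For reference, when that kind of read inconsistency matters, the search preference parameter can pin one caller to the same set of shard copies across requests; a sketch (the index name and preference string are arbitrary):

    # repeated requests with the same preference string hit the same shard copies,
    # so one user sees stable results even while replicas are still catching up
    GET my_index/_search?preference=user_123
    { "query": { "match_all": {} } }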

    
    
  • 汤尼房
    2019-08-28
    Teacher, a question: while migrating data between hot and warm nodes, I would like to monitor shard relocation, for example how much data has been relocated and at what rate. I could not find an answer in the official ES documentation; could you give me a hint?
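
    For reference, shard relocation is reported through the recovery API, which shows per-shard progress in files and bytes; a sketch assuming Elasticsearch 7.x:

    # ongoing recoveries/relocations only, with byte-level progress columns
    GET _cat/recovery?v&active_only=true&h=index,shard,type,stage,source_node,target_node,bytes_recovered,bytes_percent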
    
    
  • 标
    2019-08-23
    A question: is the write flow like this? The request reaches the coordinating node, which is responsible for forwarding requests and assembling the data; the coordinating node forwards it to the master node; the master computes the hash(routing) to decide which shard to write to; the master updates the state and returns to the coordinating node, which finally returns the response to the client.
    
    
  • Sunqc
    2019-08-21
    Teacher, I am on Windows without Docker. I started two instances with the same cluster name, but I cannot see the cluster information. I saw it once, then a week went by, and now another week, and I still cannot see it. I feel stuck here.

    Author's reply: Start them one at a time: first check whether the first node is up, then start the second one and check whether _cat/nodes shows the newly joined node.
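
    For reference, the check mentioned in the reply, assuming the two nodes listen on 9200 and 9201:

    # once the nodes have formed one cluster, both commands should list node1 and node2
    curl localhost:9200/_cat/nodes?v
    curl localhost:9201/_cat/nodes?v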

     1
    
  • Sunqc
    2019-08-16

    Following your example, last week I started three nodes and could still see the cluster information. Today I wanted to review, but when I access 9200 with cerebro I can only see node1 and no cluster information:
    bin/elasticsearch -E node.name=node1 -E cluster.name=sunqc -E path.data=node1_data -E http.port=9200 -E transport.port=9300
    bin/elasticsearch -E node.name=node2 -E cluster.name=sunqc -E path.data=node2_data -E http.port=9201 -E transport.port=9301
    Here is node1's startup log:
    ============================ node1 ====== [node2's output is the same] ====
    [2019-08-16T16:42:32,453][INFO ][o.e.t.TransportService ] [node1] publish_address {127.0.0.1:9300}, bound_addresses {127.0.0.1:9300}, {[::1]:9300}
    [2019-08-16T16:42:32,472][WARN ][o.e.b.BootstrapChecks ] [node1] the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
    [2019-08-16T16:42:32,505][INFO ][o.e.c.c.ClusterBootstrapService] [node1] no discovery configuration found, will perform best-effort cluster bootstrapping after [3s] unless existing master is discovered
    [2019-08-16T16:42:35,509][INFO ][o.e.c.c.Coordinator ] [node1] setting initial configuration to VotingConfiguration{MZVYW8hkTyGf3A_hXeIo3Q}
    [2019-08-16T16:42:35,743][INFO ][o.e.c.s.MasterService ] [node1] elected-as-master ([1] nodes joined)[{node1}{MZVYW8hkTyGf3A_hXeIo3Q}{09DZrBifRra_dVenD6nW7Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=4172951552, xpack.installed=true, ml.max_open_jobs=20} elect leader, _BECOME_MASTER_TASK_, _FINISH_ELECTION_], term: 1, version: 1, reason: master node changed {previous [], current [{node1}{MZVYW8hkTyGf3A_hXeIo3Q}{09DZrBifRra_dVenD6nW7Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=4172951552, xpack.installed=true, ml.max_open_jobs=20}]}
    [2019-08-16T16:42:35,922][INFO ][o.e.c.s.ClusterApplierService] [node1] master node changed {previous [], current [{node1}{MZVYW8hkTyGf3A_hXeIo3Q}{09DZrBifRra_dVenD6nW7Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=4172951552, xpack.installed=true, ml.max_open_jobs=20}]}, term: 1, version: 1, reason: Publication

    Author's reply: Try running docker-compose down -v and then starting it up again.
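
    For reference, the WARN line in the log above says that at least one of discovery.seed_hosts, discovery.seed_providers, or cluster.initial_master_nodes must be configured; a sketch of starting the same two nodes with explicit discovery settings (assumes the path.data directories are wiped first so no stale cluster state is left behind):

    bin/elasticsearch -E node.name=node1 -E cluster.name=sunqc -E path.data=node1_data -E http.port=9200 -E transport.port=9300 -E discovery.seed_hosts=127.0.0.1:9300,127.0.0.1:9301 -E cluster.initial_master_nodes=node1
    bin/elasticsearch -E node.name=node2 -E cluster.name=sunqc -E path.data=node2_data -E http.port=9201 -E transport.port=9301 -E discovery.seed_hosts=127.0.0.1:9300,127.0.0.1:9301 -E cluster.initial_master_nodes=node1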

    
    
  • Erick
    2019-08-07
    When deleting a document, does the request also have to be routed to the master node, which executes the delete and then forwards the delete command to the replicas? Or is it routed to the primary shard that holds the document, which executes the delete and then forwards the delete command to the replicas?
    
    