• suncar
    2019-07-01
    Could you recommend a few websites where I can get large amounts of test data? Thanks.

    Author's reply: Thanks for your comment! I'd recommend Kaggle's datasets.

    
     3
  • 明翼
    2019-07-02
    How many readers here are using Beam in production…
     5
     2
  • ditiki
    2019-07-03
    I'd like to ask about two problems we ran into in production.

    In a Beam pipeline (on Dataflow), one step sends an HTTP request to a schema registry to validate each event's schema. To reduce calls to the registry, we group by event type before this step and use a static cache. How does Beam (or the underlying runner) optimise I/O? Is it good practice to use a thread pool for asynchronous HTTP calls?

    The event object has a JSON payload (json4s library). Each time we try to update the Dataflow pipeline, we get an error saying that the Kryo coder generated for the JSON has changed, so the running pipeline can't be updated in place. As a workaround, we serialise the JSON payload to a string in a custom coder, which is probably very inefficient. Have you seen this before? Does Kryo generate a different coder on every compile?

    Thanks a lot!