CN110008970A

CN110008970A - Address information generation method and device

Info

Publication number: CN110008970A
Application number: CN201810011871.1A
Authority: CN
Inventors: 郑晓琳; 徐沛; 何海泉; 梁晓飞; 李永华; 王喜春; 平安生; 刘龙飞; 肖勇; 张泽南; 于泷; 杨高强; 孙志涛; 尹雪梅; 姜波; 王铭江
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2018-01-05
Filing date: 2018-01-05
Publication date: 2019-07-12
Anticipated expiration: 2038-01-05
Also published as: CN110008970B

Abstract

The invention discloses a method and a device for generating address information, and relates to the field of computer technology. A specific implementation of the method includes: acquiring online object demand data and offline geographic environment data; aggregating the online object demand data and the offline geographic environment data respectively using a clustering algorithm to obtain initial aggregated data points; and determining final aggregated data points according to the initial aggregated data points. This implementation can solve the prior-art problems of inaccurate and inefficient site selection for offline stores.

Description

A method and device for generating address information

Technical Field

The present invention relates to the field of computer technology, and in particular to a method and device for generating address information.

Background

With the development of e-commerce technology, large e-commerce enterprises have built up considerable technical strength. Their relatively low prices and fast, convenient service have won over consumers and produced a large volume of online data. Offline stores nevertheless retain distinct advantages: for goods that are low-priced, frequently purchased, or urgently needed day to day, consumers prefer to buy in an offline store, which saves shipping costs and gets them the goods sooner.

For e-commerce enterprises, using the large amount of offline data accumulated earlier together with site selection technology to develop offline experience stores adds a sales channel, fills the gap left by items that sell poorly online, and is also very important for brand communication.

Store siting is the first problem an e-commerce enterprise faces when opening offline smart stores. A poorly chosen site leads to problems such as excessive management costs and insufficient customer traffic. An e-commerce enterprise can use its accumulated online data, through the geographic distribution of commodity demand combined with geographic environment data, to select areas where demand is concentrated, foot traffic is dense, and transportation is convenient as sites for offline smart stores.

In the course of realizing the present invention, the inventors found at least the following problems in the prior art:

At present, e-commerce enterprises mainly use two mainstream approaches to site selection. The first collects geographic environment data (offline data), estimates the likely average daily turnover of an offline smart store after it opens (average daily foot traffic × store entry rate × average order value), subtracts the corresponding operating costs, and selects the site on the principle of maximizing returns. The second uses the enterprise's historical online data to estimate the average daily turnover and then selects the site on the same principle.

However, neither online data alone nor offline data alone fully reflects consumer demand. Using only offline data forgoes the advantage of the online data the enterprise has already accumulated, and when the enterprise later integrates its online processes into the offline store, the offline operation may prove a poor fit for local conditions. Using only online data may lead to selected stores that do not meet offline demand, because online data cannot completely record all relevant geographic environment data.

Summary of the Invention

In view of this, embodiments of the present invention provide a method and device for generating address information that can solve the prior-art problems of inaccurate and inefficient site selection for offline stores.

To achieve the above object, according to one aspect of the embodiments of the present invention, a method for generating address information is provided, comprising: acquiring online object demand data and offline geographic environment data; aggregating the online object demand data and the offline geographic environment data respectively using a clustering algorithm to obtain initial aggregated data points; and determining final aggregated data points according to the initial aggregated data points.

Optionally, determining the final aggregated data points according to the initial aggregated data points includes:

calculating the distance between each online object demand data point and the initial aggregated data points to determine the classification of the online object demand data point, and calculating the distance between each offline geographic environment demand data point and the aggregated data points to determine the classification of the offline geographic environment demand data point, wherein the online object demand data includes the online object demand data points; and

obtaining the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points.

Optionally, calculating the distance between each online object demand data point and the initial aggregated data points to determine the classification of the online object demand data point includes:

calculating the distance between each online object demand data point and each initial aggregated data point; and

calculating a weighted distance from that distance, and taking the aggregated data point closest to the online object demand data point as the classification of that data point.

In addition, calculating the distance between each offline geographic environment demand data point and the aggregated data points to determine the classification of the offline geographic environment demand data point includes:

calculating the distance between each offline geographic environment demand data point and each initial aggregated data point; and

calculating a weighted distance from that distance, and taking the aggregated data point closest to the offline geographic environment demand data point as the classification of that data point.

Optionally, obtaining the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points includes:

for the location of each online object demand data point classification, calculating the centroid of all online object demand data points within a preset radius; likewise, for the location of each offline geographic environment demand data point classification, calculating the centroid of all offline geographic environment demand data points within the preset radius; and

obtaining the final aggregated data points from the centroids of the online object demand data points and the centroids of the offline geographic environment demand data points.

Optionally, the aggregated data points are iteratively updated according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points until a preset condition is reached, so as to obtain the final aggregated data points.

Optionally, iteratively updating the aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points includes:

for the location of each online object demand data point classification, calculating the centroids of all online object demand data points within the preset radius, $A_1'(x_{a1}', y_{a1}')$, $B_1'(x_{b1}', y_{b1}')$, $C_1'(x_{c1}', y_{c1}')$, $D_1'(x_{d1}', y_{d1}')$, ...; likewise, for the location of each offline geographic environment demand data point classification, calculating the centroids of all offline geographic environment demand data points within the preset radius, $A_1''(x_{a1}'', y_{a1}'')$, $B_1''(x_{b1}'', y_{b1}'')$, $C_1''(x_{c1}'', y_{c1}'')$, $D_1''(x_{d1}'', y_{d1}'')$, ...;

determining the aggregated data points for the next iteration:

$$N_1(x, y) = \left[ N_1'(x_{n1}', y_{n1}') + \beta \times N_1''(x_{n1}'', y_{n1}'') \right] \div 2$$

where:

$$N \in \{A, B, C, D, \ldots\}$$

when β > 1, the offline geographic environment demand data points have more influence than the online object demand data points;

when β < 1, the offline geographic environment demand data points have less influence than the online object demand data points;

when β = 1, the offline geographic environment demand data points and the online object demand data points have equal influence.
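The update rule above can be sketched in a few lines of Python. This is only an illustration: the function name `merge_centroids` and the `(longitude, latitude)` tuple layout are assumptions that do not appear in the patent text, and the formula is applied coordinate by coordinate exactly as written, including the division by 2.

```python
from typing import Tuple

# Assumed representation: (longitude, latitude) in decimal degrees.
Point = Tuple[float, float]


def merge_centroids(online: Point, offline: Point, beta: float = 1.0) -> Point:
    """Apply N1 = [N1' + beta * N1''] / 2 to each coordinate.

    online  -- centroid N1' of the online demand points for one class
    offline -- centroid N1'' of the offline environment points for the same class
    beta    -- weighting of the offline centroid, as described in the text
    """
    x = (online[0] + beta * offline[0]) / 2
    y = (online[1] + beta * offline[1]) / 2
    return (x, y)


# Hypothetical centroids for one class, merged with equal influence.
next_center = merge_centroids((116.350, 39.980), (116.344, 39.998), beta=1.0)
```

With β = 1 the result is the midpoint of the two centroids. For other values the formula as written does not normalize by 1 + β, so the sketch reproduces it literally rather than as a weighted average.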

Optionally, the preset condition is that the distance between the aggregated data points after the update and the aggregated data points before the update is less than an error distance, or that the number of iterations is greater than or equal to a set maximum iteration threshold.

Optionally, the clustering algorithm is the k-means clustering algorithm.

In addition, according to one aspect of the embodiments of the present invention, a device for generating address information is provided, comprising: an acquisition module for acquiring online object demand data and offline geographic environment data; a calculation module for aggregating the online object demand data and the offline geographic environment data respectively using a clustering algorithm to obtain initial aggregated data points; and a selection module for determining final aggregated data points according to the initial aggregated data points.

Optionally, the selection module determining the final aggregated data points according to the initial aggregated data points includes:

Optionally, the selection module calculating the distance between each online object demand data point and the initial aggregated data points to determine the classification of the online object demand data point includes:

In addition, the calculation module calculating the distance between each offline geographic environment demand data point and the initial aggregated data points to determine the classification of the offline geographic environment demand data point includes:

Optionally, the selection module obtaining the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points includes:

Optionally, the selection module iteratively updates the aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points until a preset condition is reached, so as to obtain the final aggregated data points.

Optionally, the selection module iteratively updating the aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points includes:

where:

$$N \in \{A, B, C, D, \ldots\}$$

According to another aspect of the embodiments of the present invention, an electronic device is also provided, comprising:

one or more processors; and

a storage device for storing one or more programs,

wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of the above address information generation embodiments.

According to another aspect of the embodiments of the present invention, a computer-readable medium is also provided, on which a computer program is stored, the program implementing the method of any of the above address information generation embodiments when executed by a processor.

An embodiment of the above invention has the following advantages or beneficial effects: because offline store address information is generated by combining the e-commerce enterprise's online order demand data with offline geographic environment data, the returns of the offline stores are maximized and the enterprise's brand effect can be strengthened.

Further effects of the above non-conventional optional implementations are described below in connection with the specific embodiments.

Brief Description of the Drawings

The accompanying drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:

FIG. 1 is a schematic diagram of the main flow of a method for generating address information according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the main flow of a method for generating address information according to a referenceable embodiment of the present invention;

FIG. 3 is a schematic diagram of the main modules of a device for generating address information according to an embodiment of the present invention;

FIG. 4 is a diagram of an exemplary system architecture to which an embodiment of the present invention may be applied;

FIG. 5 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings. The description includes various details of the embodiments to aid understanding, and these should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.

FIG. 1 shows a method for generating address information according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

Step S101: acquire online object demand data and offline geographic environment data.

The online object demand data includes the order data of online objects and the latitude and longitude coordinates corresponding to that order data, that is, the online object demand data points.

Step S102: aggregate the online object demand data and the offline geographic environment data respectively using a clustering algorithm to obtain initial aggregated data points.

Here, the clustering algorithm is initialized to obtain the initial aggregated data points.

Step S103: determine final aggregated data points according to the initial aggregated data points. The specific implementation process includes:

The online object demand data includes online object demand data points.

Step 1: calculate the distance between each online object demand data point and the initial aggregated data points to determine the classification of the online object demand data point, and calculate the distance between each offline geographic environment demand data point and the aggregated data points to determine the classification of the offline geographic environment demand data point.

Preferably, step 1 can be performed by calculating the distance between each online object demand data point and each initial aggregated data point, calculating a weighted distance from that distance, and taking the aggregated data point closest to the online object demand data point as the classification of that data point.

Similarly, the distance between each offline geographic environment demand data point and each initial aggregated data point can be calculated, a weighted distance can be calculated from that distance, and the aggregated data point closest to the offline geographic environment demand data point can be taken as the classification of that data point.

Step 2: obtain the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points.

Preferably, for the location of each online object demand data point classification, the centroid of all online object demand data points within a preset radius is calculated; likewise, for the location of each offline geographic environment demand data point classification, the centroid of all offline geographic environment demand data points within the preset radius is calculated.

The final aggregated data points are then obtained from the centroids of the online object demand data points and the centroids of the offline geographic environment demand data points.

In another preferred embodiment, the aggregated data points are iteratively updated according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points until a preset condition is reached, so as to obtain the final aggregated data points. The specific implementation process includes:

When step 3 is performed, for the location of each online object demand data point classification, the centroids of all online object demand data points within the preset radius can be calculated, $A_1'(x_{a1}', y_{a1}')$, $B_1'(x_{b1}', y_{b1}')$, $C_1'(x_{c1}', y_{c1}')$, $D_1'(x_{d1}', y_{d1}')$, ...; likewise, for the location of each offline geographic environment demand data point classification, the centroids of all offline geographic environment demand data points within the preset radius are calculated, $A_1''(x_{a1}'', y_{a1}'')$, $B_1''(x_{b1}'', y_{b1}'')$, $C_1''(x_{c1}'', y_{c1}'')$, $D_1''(x_{d1}'', y_{d1}'')$, ...;

where:

$$N \in \{A, B, C, D, \ldots\}$$

Further, the preset condition is that the distance between the aggregated data points after the update and the aggregated data points before the update is less than an error distance, or that the number of iterations is greater than or equal to a set maximum iteration threshold.
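The preset condition can be expressed as a small predicate. The sketch below is illustrative only: the name `should_stop`, the use of plain Euclidean distance on coordinate pairs, and the requirement that every aggregated data point move less than the error distance are assumptions, since the text does not specify how the distance is measured or how several points are combined.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # assumed (longitude, latitude) layout


def should_stop(old: Sequence[Point], new: Sequence[Point], iteration: int,
                error_distance: float, max_iterations: int) -> bool:
    """Return True when either part of the preset condition holds."""
    # Second part of the condition: iteration count reached the threshold.
    if iteration >= max_iterations:
        return True
    # First part (assumption): every aggregated point moved less than
    # error_distance between the previous and the current iteration.
    return all(math.dist(p, q) < error_distance for p, q in zip(old, new))
```

A caller would evaluate `should_stop` once per iteration, after the aggregated data points have been updated, and exit the loop as soon as it returns `True`.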

As the above embodiments show, the address information generation method can find the best commodity aggregation points by aggregating online commodity demand data, and can adjust the best store locations using offline geographic environment information. In addition, the present invention leaves a margin for manual site selection: by adjusting a parameter, the total number of stores determined by the address information generation technique can be made slightly larger than the number in the final plan, and the final stores are then screened by taking social factors into account.

FIG. 2 is a schematic diagram of the main flow of a method for generating address information according to a referenceable embodiment of the present invention. The method may include:

Step S201: acquire online object demand data and offline geographic environment data.

In an embodiment, the order data of online objects can be collected so that the online order data reflects the demand for each object. Preferably, the online objects can also be filtered, and the selected objects are taken as those that can be sold in offline stores.

Further, the latitude and longitude coordinates of each object within the selected store siting area are obtained (that is, the coordinates to which the object's total sales in that area correspond), and the total online sales volume of each object in the area over a preset period is counted. The coordinates are accurate to three decimal places. For example, see Table 1, in which each object (a commodity in this embodiment) is represented as a data row. Each row is treated as a data point: for the i-th data point, the longitude is $x_i$, the latitude is $y_i$, and the total profit of the commodity (which may be the unit profit multiplied by the total sales volume) is $w_i$. Preferably, data points with a total profit of 0 are removed.

In addition, commodities are identified by SKU. SKU stands for Stock Keeping Unit, the unit in which stock movements are measured, which may be a piece, a box, a pallet, and so on. SKUs are a necessary tool for logistics management in large supermarket chains and distribution centers. The term has since come to mean a unified commodity number, with each commodity corresponding to a unique SKU number.

For example, suppose sites for smart stores are to be selected in the Beijing area. By querying historical data and compiling the order data of all relevant objects in the Beijing area over the last two years, the following table of historical commodity demand can be produced.

It should be noted that the higher the coordinate precision used in the statistics, the less readily orders for the same object aggregate and the larger the volume of order data used to assist address information generation; the lower the precision, the smaller the volume of data. A larger volume of data does not necessarily yield a better result from the algorithm, however. Preferably, coordinates are kept to three decimal places, so that order data within a radius of about 0.15 km is treated as data at the same location.
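One way to picture this binning step is the sketch below, which rounds order coordinates to three decimal places, sums the profit per SKU and location, and drops points whose total profit is 0. The record layout `(sku, longitude, latitude, profit)` and the function name `aggregate_orders` are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical order record: (sku, longitude, latitude, profit).
Order = Tuple[str, float, float, float]
Key = Tuple[str, float, float]


def aggregate_orders(orders: Iterable[Order], decimals: int = 3) -> Dict[Key, float]:
    """Group orders by SKU and rounded coordinates, summing profit."""
    totals: Dict[Key, float] = defaultdict(float)
    for sku, lon, lat, profit in orders:
        # Orders whose coordinates agree after rounding share one data point.
        key = (sku, round(lon, decimals), round(lat, decimals))
        totals[key] += profit
    # Data points with a total profit of 0 are removed.
    return {key: w for key, w in totals.items() if w != 0}


# Made-up orders: the first two fall into the same three-decimal cell.
points = aggregate_orders([
    ("10000001", 116.3531, 39.9829, 2.0),
    ("10000001", 116.3534, 39.9831, 3.0),
    ("10000002", 116.1250, 39.1120, 0.0),
])
```

Lowering `decimals` merges more orders into each data point and shrinks the data set, which is the trade-off the paragraph above describes.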

Table 1. Historical commodity demand data

| Commodity SKU | Longitude ($x_i$) | Latitude ($y_i$) | Total profit ($w_i$) |
|---|---|---|---|
| 10000001 | 116.353 | 39.983 | 5 |
| 10000001 | 116.345 | 40.662 | 3 |
| 10000002 | 116.125 | 39.112 | 8 |
| … | … | … | … |
| 89245922 | 116.349 | 39.024 | 1 |

In addition, the acquired offline geographic environment data is information about locations with relatively heavy passenger flow in the store siting area. This information includes the longitude, the latitude, and a passenger flow weight set for each coordinate according to its passenger flow.

Examples include areas with heavy passenger flow described by subway information, business district information, and residential area information, and more specifically subway stations, bus stops, and residential communities. As shown in Table 2, for each subway station the latitude and longitude coordinates are obtained through an online map API, and a weight $v_j$ is set according to the station's passenger flow.

Table 2. Subway station distribution data

| Station No. | Station name | Longitude ($x_j$) | Latitude ($y_j$) | Passenger flow weight ($v_j$) |
|---|---|---|---|---|
| 1 | Jinghailu (经海路) | 116.569 | 39.789 | 5 |
| 2 | Xiaocun (肖村) | 116.454 | 39.840 | 1 |
| 3 | Wudaokou (五道口) | 116.344 | 39.998 | 10 |
| … | … | … | … | … |

Step S202: initialize the k-means algorithm to obtain initial aggregated data points.

The k-means clustering algorithm is a typical prototype-based objective-function clustering method: it takes some distance from the data points to the prototypes as the objective function to be optimized, and derives the adjustment rule for the iteration by finding the extremum of that function.

As an embodiment, k initial aggregated data points are chosen arbitrarily, that is, k data points are selected from the offline geographic environment data points as the initial aggregated data points. Further, the following can be set:

$$k = m + u$$

where $m$ is the number of smart stores to be selected and $u$ is an adjustment parameter.

The purpose of u is to select m + u smart offline stores and then make the final choice among them by weighing external factors (for example, popular business districts and popular shopping centers can strengthen the enterprise's brand effect). u = 0 means that no adjustment is made and the k-means algorithm is relied on entirely to generate the address information.

In the above example, suppose 2 stores are to be selected and u = 2 is set. Four points, $A_0(x_{a0}, y_{a0})$, $B_0(x_{b0}, y_{b0})$, $C_0(x_{c0}, y_{c0})$, and $D_0(x_{d0}, y_{d0})$, are then selected as the initial aggregated data points.
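The initialization in this example can be sketched as follows, assuming that the number of initial points is k = m + u (2 + 2 = 4 here) and that they are drawn at random from the offline geographic environment data points. The function name `initial_centers` and the optional `seed` argument are hypothetical additions for reproducibility.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # assumed (longitude, latitude) layout


def initial_centers(offline_points: Sequence[Point], m: int, u: int = 0,
                    seed: Optional[int] = None) -> List[Point]:
    """Pick k = m + u distinct offline points as initial aggregated data points."""
    k = m + u
    if k > len(offline_points):
        raise ValueError("not enough offline points to choose k initial centers")
    rng = random.Random(seed)
    # sample() draws without replacement, so no point is chosen twice.
    return rng.sample(list(offline_points), k)


# The first three coordinates come from Table 2; the last two are made up.
stations = [(116.569, 39.789), (116.454, 39.840), (116.344, 39.998),
            (116.400, 39.900), (116.300, 39.950)]
centers = initial_centers(stations, m=2, u=2, seed=0)
```

Setting `u=0` yields exactly `m` initial points, which corresponds to relying on the clustering result without a manual screening margin.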

Preferably, to make the algorithm converge quickly, the historical object demand points can first be plotted, and the initial aggregated data points selected in areas where demand points cluster densely.

Step S203: calculate the distance between each online object demand data point and the initial aggregated data points to determine the classification of the online object demand data point.

As an embodiment, for each object demand data point i, the initial distances $RA_{i0}$, $RB_{i0}$, $RC_{i0}$, $RD_{i0}$, ... between the data point and the selected initial aggregated data points are calculated from the latitude and longitude. The initial weighted distances $RA_{i1}$, $RB_{i1}$, $RC_{i1}$, $RD_{i1}$, ... are then calculated from the initial distances:

$$RA_{i1} = RA_{i0} / (1 + \alpha \times w_i)$$

The parameter α is adjustable: the larger α is, the more sensitive address information generation is to profit. α ranges over (0, +∞); preferably, α is chosen slightly greater than 1. $w_i$ is the total profit.

Based on the calculated initial weighted distances $RA_{i1}$, $RB_{i1}$, $RC_{i1}$, $RD_{i1}$, ..., the initial aggregated data point closest to object demand data point i is taken as the initial classification of that data point. For example, if $RB_{i1}$ is found to be the minimum for object demand data point i, this commodity demand data point is classified as B.
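The classification in step S203 can be sketched as below. The text only says that the distance is calculated from latitude and longitude, so the use of the haversine great-circle formula is an assumption, as are the names `haversine_km` and `classify` and the dictionary of labeled centers.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(p: Point, q: Point) -> float:
    """Great-circle distance in km (assumed distance measure)."""
    lon1, lat1 = math.radians(p[0]), math.radians(p[1])
    lon2, lat2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def classify(point: Point, profit: float, centers: Dict[str, Point],
             alpha: float = 1.1) -> Tuple[str, float]:
    """Return the label of the nearest center and its weighted distance.

    Weighted distance follows R_i1 = R_i0 / (1 + alpha * w_i).
    """
    weighted = {label: haversine_km(point, center) / (1 + alpha * profit)
                for label, center in centers.items()}
    label = min(weighted, key=weighted.get)
    return label, weighted[label]


# Hypothetical initial centers labeled A and B, and one demand point.
centers = {"A": (116.353, 39.983), "B": (116.125, 39.112)}
label, distance = classify((116.345, 39.990), profit=3.0, centers=centers)
```

The same function could be reused for offline geographic environment data points by passing the passenger flow weight in place of the profit.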

Step S204: Calculate the distance between each offline geographic environment demand data point and the initial aggregated data points to determine the classification of the offline geographic environment demand data point.

As an embodiment, using the same algorithm as in step S203, for each offline geographic environment demand data point j, the initial weighted distances RA_j1, RB_j1, RC_j1, RD_j1, … between point j and the selected initial aggregated data points are calculated from latitude and longitude and the passenger flow weight (v_i). Then, based on the initial weighted distances and following the classification method of step S203, the initial classification of each offline geographic environment demand point j is determined.

Step S205: Based on the classification of the online object demand data points and the classification of the offline geographic environment demand data points, iteratively update the aggregated data points until a preset condition is reached, so as to obtain the final aggregated data points.

As an embodiment, for the location information of each online object demand data point class, the centroid (the average of the coordinates) of all online object demand data points within radius R is calculated, giving A₁′(x_a1′, y_a1′), B₁′(x_b1′, y_b1′), C₁′(x_c1′, y_c1′), D₁′(x_d1′, y_d1′). Likewise, for the location information of each offline geographic environment demand data point class, the centroids A₁″(x_a1″, y_a1″), B₁″(x_b1″, y_b1″), C₁″(x_c1″, y_c1″), D₁″(x_d1″, y_d1″) are calculated.

For the choice of R: the online object demand data points or the offline geographic environment demand data points may first be plotted, and the value of R then determined from the plot and the actual situation. Preferably, R may initially be set to 5 km, meaning that the store's audience is taken to be customers within 5 km. R can later be adjusted using the data accumulated after the physical store opens, and the address information of new stores then recalculated.

Then, the aggregated data points for the next iteration are determined:

N₁(x, y) = [N′₁(x′_n1, y′_n1) + β × N″₁(x″_n1, y″_n1)] ÷ 2

where:

N ∈ {A, B, C, D, …}

β may be equal to 1, slightly greater than 1, or slightly less than 1, where:

When β > 1, the offline geographic environment demand data points have more influence than the online object demand data points.

When β < 1, the offline geographic environment demand data points have less influence than the online object demand data points.
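The update of one aggregated data point from its two centroids can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the points are treated as plain (x, y) coordinate pairs, and both input lists are assumed to have already been restricted to the points of one class that lie within radius R.

```python
def centroid(points):
    """Coordinate-wise mean of a non-empty list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def next_center(online_points, offline_points, beta=1.0):
    """N1 = [N1' + beta * N1''] / 2, applied to each coordinate."""
    x1, y1 = centroid(online_points)   # N1': centroid of the online demand points
    x2, y2 = centroid(offline_points)  # N1'': centroid of the offline points
    return ((x1 + beta * x2) / 2, (y1 + beta * y2) / 2)
```

With β = 1 the result is the midpoint of the two centroids; for instance, `next_center([(0, 0), (2, 2)], [(4, 4)])` gives `(2.5, 2.5)`. The sketch follows the formula literally, including the fixed divisor 2.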

Finally, with the newly calculated aggregated data points A₁, B₁, C₁, D₁, …, the process returns to step S203 and iterates again to calculate A₂, B₂, C₂, D₂, …. When the iterative optimization reaches the preset condition, the loop ends, and the finally determined aggregated data points, that is, the finally selected offline store locations, are obtained. The preset condition may be that the distance between a newly calculated aggregated data point and the old one (the aggregated data point from the previous iteration) is less than an error distance S, or that the number of iterations is greater than or equal to a preset maximum iteration threshold T.

Specifically, the loop stop condition is:

distance(N_q, N_{q-1}) < S or q ≥ T

q: number of iterations

S: error distance set in advance

T: maximum iteration threshold set in advance
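The stop condition can be sketched as a loop around an update step. The names below are hypothetical, and two choices are assumptions of this sketch rather than statements of the source: the distance is taken as planar Euclidean distance, and the loop stops on the distance criterion only once every aggregated data point has moved less than S.

```python
import math


def iterate_centers(centers, update, S, T):
    """Repeat `update` until all centers move less than S or q reaches T.

    centers: dict mapping a label to an (x, y) tuple.
    update:  callable that takes the current centers and returns new ones
             (for example, one pass of steps S203 to S205).
    Returns the final centers and the number of iterations q performed.
    """
    q = 0
    while True:
        new_centers = update(centers)
        q += 1
        # Largest displacement of any center in this iteration.
        moved = max(math.dist(new_centers[k], centers[k]) for k in centers)
        centers = new_centers
        if moved < S or q >= T:
            return centers, q
```

The threshold T guarantees termination even if the displacement never falls below S.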

Step S206: Obtain the selected addresses from the finally determined aggregated data points.

In this embodiment, step S205 yields k offline stores, that is, the finally determined aggregated data points, and the number m of smart stores to be selected is then settled according to the adjustment parameter u. In the specific embodiment above, 2 stores are finally selected from the 4 store addresses produced by the k-means algorithm after weighing various factors.
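The source does not specify how the external factors are combined when narrowing the candidates down to m stores. Purely as an illustration, the sketch below assumes that each candidate has already been given a single numeric score (a hypothetical `external_score` mapping, for example reflecting proximity to a popular business district) and keeps the m highest-scoring candidates.

```python
def select_final_stores(candidates, external_score, m):
    """Keep the m candidates with the highest externally supplied score.

    candidates:     list of labels of the final aggregated data points.
    external_score: dict mapping each label to a number (hypothetical input).
    """
    ranked = sorted(candidates, key=lambda label: external_score[label],
                    reverse=True)
    return ranked[:m]
```

With four candidates and m = 2, as in the example above, a call such as `select_final_stores(["A", "B", "C", "D"], scores, 2)` returns the two labels with the largest scores.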

In addition, the specific implementation of the address information generation method in this referable embodiment has already been described in detail in the address information generation method above, so it is not repeated here.

FIG. 3 shows an address information generating apparatus according to an embodiment of the present invention. As shown in FIG. 3, the address information generating apparatus 300 includes an acquisition module 301, a calculation module 302, and a selection module 303. The acquisition module 301 acquires online object demand data and offline geographic environment data. The calculation module 302 then uses a clustering algorithm to aggregate the online object demand data and the offline geographic environment data respectively, to obtain initial aggregated data points. Finally, the selection module 303 determines the final aggregated data points from the initial aggregated data points.

As a preferred embodiment, the selection module 303 determines the final aggregated data points from the initial aggregated data points to obtain the selected address. The specific implementation process includes:

where:

N ∈ {A, B, C, D, …}

It should be noted that the specific implementation of the address information generating apparatus of the present invention has already been described in detail in the address information generation method above, so it is not repeated here.

FIG. 4 shows an exemplary system architecture 400 to which the address information generating method or address information generating apparatus of an embodiment of the present invention may be applied.

As shown in FIG. 4, the system architecture 400 may include terminal devices 401, 402, and 403, a network 404, and a server 405. The network 404 is the medium that provides communication links between the terminal devices 401, 402, 403 and the server 405. The network 404 may include various connection types, such as wired or wireless communication links or fiber optic cables.

Users can use the terminal devices 401, 402, 403 to interact with the server 405 through the network 404 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 401, 402, 403, such as shopping applications, web browser applications, search applications, instant messaging tools, email clients, and social platform software (examples only).

The terminal devices 401, 402, 403 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.

The server 405 may be a server that provides various services, for example a background management server (example only) that supports shopping websites browsed by users on the terminal devices 401, 402, 403. The background management server can analyze and otherwise process received data such as product information query requests, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal devices.

It should be noted that the address information generating method provided by the embodiments of the present invention is generally executed by the server 405; accordingly, the address information generating apparatus is generally provided in the server 405.

It should be understood that the numbers of terminal devices, networks, and servers in FIG. 4 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.

Referring now to FIG. 5, a schematic structural diagram of a computer system 500 suitable for implementing a terminal device of an embodiment of the present invention is shown. The terminal device shown in FIG. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.

As shown in FIG. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 508 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, the ROM 502, and the RAM 503 are connected to one another through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

The following components are connected to the I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT), a liquid crystal display (LCD), or the like, and a speaker; a storage section 508 including a hard disk or the like; and a communication section 509 including a network interface card such as a LAN card or a modem. The communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to the I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 510 as needed, so that a computer program read from it can be installed into the storage section 508 as needed.

In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above-described functions defined in the system of the present invention are performed.

It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device. A computer-readable signal medium, by contrast, may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical fiber cable, RF, or any suitable combination of the above.

The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code that contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, a processor may be described as including an acquisition module, a calculation module, and a selection module. In some cases the names of these modules do not limit the modules themselves.

In another aspect, the present invention also provides a computer-readable medium, which may be included in the device described in the above embodiments or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: acquire online object demand data and offline geographic environment data; use a clustering algorithm to aggregate the online object demand data and the offline geographic environment data respectively, to obtain initial aggregated data points; and determine final aggregated data points from the initial aggregated data points.

According to the technical solution of the embodiments of the present invention, because offline store address information is generated by combining an e-commerce enterprise's online order demand data with offline geographic environment data, the benefit of offline stores is maximized and the brand effect of the e-commerce enterprise can be enhanced.

The above specific embodiments do not limit the protection scope of the present invention. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims

1. A method for generating address information, comprising:

acquiring online object demand data and offline geographic environment data;

using a clustering algorithm to aggregate the online object demand data and the offline geographic environment data respectively, to obtain initial aggregated data points; and

determining final aggregated data points based on the initial aggregated data points.

2. The method according to claim 1, wherein determining the final aggregated data points based on the initial aggregated data points comprises:

calculating the distance between the online object demand data points and the initial aggregated data points to determine the classification of the online object demand data points, and calculating the distance between the offline geographic environment demand data points and the aggregated data points to determine the classification of the offline geographic environment demand data points, wherein the online object demand data includes the online object demand data points; and

obtaining the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points.

3. The method according to claim 2, wherein calculating the distance between the online object demand data points and the initial aggregated data points to determine the classification of the online object demand data points comprises:

calculating the distance between each online object demand data point and the initial aggregated data points; and

calculating a weighted distance from the distance, and determining the aggregated data point closest to the online object demand data point as the classification of that online object demand data point;

and wherein calculating the distance between the offline geographic environment demand data points and the aggregated data points to determine the classification of the offline geographic environment demand data points comprises:

calculating the distance between each offline geographic environment demand data point and the initial aggregated data points; and

calculating a weighted distance from the distance, and determining the aggregated data point closest to the offline geographic environment demand data point as the classification of that offline geographic environment demand data point.

4. The method according to claim 2, wherein obtaining the final aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points comprises:

for the location information of each online object demand data point class, calculating the centroid of all online object demand data points within a preset radius; likewise, for the location information of each offline geographic environment demand data point class, calculating the centroid of all offline geographic environment demand data points within the preset radius; and

obtaining the final aggregated data points based on the centroid of all online object demand data points and the centroid of all offline geographic environment demand data points.

5. The method according to claim 2, wherein, according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points, the aggregated data points are iteratively updated until a preset condition is reached, so as to obtain the final aggregated data points.

6. The method according to claim 5, wherein iteratively updating the aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points comprises:

for the location information of each online object demand data point class, calculating the centroids A₁′(x_a1′, y_a1′), B₁′(x_b1′, y_b1′), C₁′(x_c1′, y_c1′), D₁′(x_d1′, y_d1′), … of all online object demand data points within the preset radius; likewise, for the location information of each offline geographic environment demand data point class, calculating the centroids A₁″(x_a1″, y_a1″), B₁″(x_b1″, y_b1″), C₁″(x_c1″, y_c1″), D₁″(x_d1″, y_d1″), … of all offline geographic environment demand data points within the preset radius;

determining the aggregated data points for the next iteration:

N₁(x, y) = [N′₁(x′_n1, y′_n1) + β × N″₁(x″_n1, y″_n1)] ÷ 2

where:

N ∈ {A, B, C, D, …}

when β > 1, the influence of the offline geographic environment demand data points is greater than that of the online object demand data points;

when β < 1, the influence of the offline geographic environment demand data points is less than that of the online object demand data points;

when β = 1, the influence of the offline geographic environment demand data points is equal to that of the online object demand data points.

7. The method according to claim 5, wherein the preset condition is: the distance between the aggregated data points after the update and the aggregated data points before the update is less than the error distance, or the number of cycles is greater than or equal to the set maximum cycle number threshold.

8. The method according to any one of claims 1-7, wherein the clustering algorithm adopts a k-means clustering algorithm.

9. A device for generating address information, comprising:

an acquisition module, configured to acquire online object demand data and offline geographic environment data;

a computing module, configured to use a clustering algorithm to aggregate the online object demand data and the offline geographic environment data respectively to obtain initial aggregated data points; and

a selection module, configured to determine final aggregated data points according to the initial aggregated data points.

10. The apparatus according to claim 9, wherein the selection module determines the final aggregated data points according to the initial aggregated data points, comprising:

11. The device according to claim 10, wherein the selection module calculates the distance between the online object demand data points and the initial aggregation data points to determine the classification of the online object demand data points, comprising:

In addition, the computing module calculates the distance between the offline geographic environment demand data point and the initial aggregated data point to determine the classification of the offline geographic environment demand data point, including:

12. The device according to claim 10, wherein the selection module obtains final aggregated data points according to the classification of online object demand data points and offline geographical environment demand data points, comprising:

13. The device according to claim 10, wherein the selection module iteratively updates the aggregated data points according to the classification of the online object demand data points and the classification of the offline geographic environment demand data points until a preset condition is reached, to obtain the final aggregated data points.

14. The device according to claim 13, wherein the selection module iteratively updates the aggregated data points according to the classification of online object demand data points and the offline geographical environment demand data point classification, comprising:

Determine the aggregated data points for the next iteration:

where:

N∈{A,B,C,D…}

When β>1: the influence of offline geographical environment demand data points is greater than that of online object demand data points;

15. The device according to claim 13, wherein the preset condition is: the distance between the aggregated data points after the update and the aggregated data points before the update is less than the error distance, or the number of cycles is greater than or equal to the set maximum cycle number threshold.

16. The apparatus according to any one of claims 9-15, wherein the clustering algorithm adopts a k-means clustering algorithm.

17. An electronic device, comprising:

one or more processors;

storage means for storing one or more programs,

The one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-8.

18. A computer-readable medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method according to any one of claims 1-8 is implemented.