Code for the Earth Movers Distance (EMD)
Introduction
This is an implementation of the Earth Movers Distance, as described in [1]. The EMD computes the distance between two distributions, which are represented by signatures. The signatures are sets of weighted features that capture the distributions. The features can be of any type and in any number of dimensions, and are defined by the user.
The EMD is defined as the minimum amount of work needed to change one signature into the other. The notion of "work" is based on the user-defined ground distance which is the distance between two features. The size of the two signatures can be different. Also, the sum of weights of one signature can be different than the sum of weights of the other (partial match). Because of this, the EMD is normalized by the smaller sum.
The code is implemented in C, and is based on the solution for the Transportation problem as described in [2]
Please let me know of any bugs you find, or any questions, comments, suggestions, and criticisms you have. If you find this code useful for your work, I would like very much to hear from you. Once you do, I'll inform you of any improvements, etc. Also, an acknowledgment in any publication describing work that uses this code would be greatly appreciated.
Usage
To use the code, perform the following steps:
download the files emd.h and emd.c (check the log of changes for latest changes).
In emd.h, modify the line
typedef int feature_t;
to reflect your feature data type. Structures may be used too, for example
typedef struct {
int X,Y,Z;
} feature_t;
To compute an EMD, call:
float emd(signature_t *Signature1, signature_t *Signature2,
float (*func)(feature_t *, feature_t *),
flow_t *Flow, int *FlowSize);
where
Signature1, Signature2:
Pointers to the two signatures that their distance we want to compute.
Dist:
Pointer to the ground distance function. i.e. the function that computes the distance between two features.
Flow:
(Optional) Pointer to a vector of flow_t (defined in emd.h) where the resulting flow will be stored. Flow must have n1+n2-1 elements, where n1 and n2 are the sizes of the two signatures respectively. If NULL, the flow is not returned.
FlowSize:
(Optional) In case Flow is not NULL, FlowSize should point to a integer where the number of flow elements (always less or equal to n1+n2-1) will be written.
Compile emd.c and link it to your code.
The signature data type signature_t is defined in emd.h as:
typedef struct
{
int n; /* Number of features in the distribution */
feature_t *Features; /* Pointer to the features vector */
float *Weights; /* Pointer to the weights of the features */
} signature_t;
The feature data type feature_t is defined in emd.h and should be modified by the user to reflect his feature type.
In special cases, the user might want to change some of the values in the definitions in emd.h:
#define MAX_SIG_SIZE 100
The maximum allowed number of features in a signature
#define MAX_ITERATIONS 100
Maximum number of iterations. For ordinary problems, 100 should be more than enough.
#define INFINITY 1e20
INFINITY should be much larger than any other value in the problem
#define EPSILON 1e-6
EPSILON determines the accuracy of the solution.
Examples
example1.c
In this example the features are in three-dimensional space, and the ground distance is the Euclidean distance.
To try this example, you need to modify feature_t in emd.h to be
typedef struct { int X,Y,Z; } feature_t;
example2.c
Here, instead of providing a function to compute the ground distances, they are given in a predefined matrix. This is done by defining feature_t as int (the default), and setting the features in each signature to have consecutive numbers. The ground distance function uses these numbers as indexes to the predefined cost matrix.
The resulting flow is returned from the emd function by passing as the last parameters a vector of flow_t, and a pointer to int where the actual number of flow elements will be written.
Also, in this example the total weights of the two signatures are different. The first signature has a total weight of 1 while the second signature has a total weight of 0.9.
Log of changes
Date Decription
05/10/98 Changed from absolute error check to relative error check + Fixed a bug that caused the "Unexpected error in findBasicVariables" error message (thanks Mark Ruzon)
03/04/98 Fixed a bug that sometime caused a crash when the second signature had only one feature (thanks Mark Ruzon)
03/31/98 Fixed a bug in the case where *both* signatures have only one feature (thanks Rajesh Batra)
References
[1] Y. Rubner, C. Tomasi, and L. J. Guibas. A Metric for Distributions with Applications to Image Databases. Proceedings of the 1998 IEEE International Conference on Computer Vision, Bombay, India, January 1998, pp. 59-66.
[2] F. S. Hillier and G. J. Lieberman. Introduction to Mathematical Programming McGraw-Hill, 1990.

ChengWong
- 粉丝: 0
最新资源
- 毕设&课设:大二课程设计 - 智慧班级管理系统 - 前端.zip
- 毕设&课设:电子科技大学 软件技术基础综合课程设计项目整合-1、最优物流路线计算实验 2、多进程多用户文件一致.zip
- 毕设&课设:杜亮的研究生毕业也设计-基于 SOA 的智慧社区管理系统的研究与实现.zip
- 毕设&课设:该作品是2021年中国高校计算机大赛-微信小程序应用开发大赛的作品,作品为智慧校园类型的微信小程序。.zip
- 毕设&课设:广州市城市规划勘测设计研究院智慧规划师系统.zip
- 毕设&课设:基于ISO IEC 14443 Type A的智慧车库门禁控制系统设计与实现.zip
- 毕设&课设:基于Java+Springboot+Vue校园选课小程序-毕业设计,本系统分微信小程序和管理后台两部.zip
- 毕设&课设:基于Java+Springboot+Vue教学培训小程序-毕业设计,本系统分微信小程序和管理后台两部.zip
- 毕设&课设:基于Java+SpringBoot+Vue充电桩管理系统-毕业设计,本系统前台和后台管理两部分,项目.zip
- 毕设&课设:基于SpringBoot的智慧社区管理系统,计算机毕业设计,毕设,Java毕业设计,SpringBo.zip
- 毕设&课设:基于SSM+Jquery+Bootstrap+Maven+JSP的智慧物流前台,该项目荣获北京联合大.zip
- 毕设&课设:基于SpringBoot和VUE技术的智慧生活商城系统设计与实现,计算机毕业设计,毕设,Java毕业.zip
- 毕设&课设:基于web的智慧社区设计与实现.zip
- 毕设&课设:基于web的智慧养老平台,计算机毕业设计,毕设,Java毕业设计,SpringBoot,SSM,小程.zip
- 毕设&课设:基于SSM智慧养老院服务系统设计毕业源码案例设计.zip
- 毕设&课设:基于安卓AndroidStudio智慧医疗医院预约挂号app设计毕业源码案例设计.zip
资源上传下载、课程学习等过程中有任何疑问或建议,欢迎提出宝贵意见哦~我们会及时处理!
点击此处反馈


