Could you please help me translate this English passage?

Most data mining algorithms require the setting of many input parameters. Two main dangers of working with parameter-laden algorithms are the following. First, incorrect settings may cause an algorithm to fail in finding the true patterns. Second, a perhaps more insidious problem is that the algorithm may report spurious patterns that do not really exist, or greatly overestimate the significance of the reported patterns. This is especially likely when the user fails to understand the role of parameters in the data mining process.
Data mining algorithms should have as few parameters as possible, ideally none. A parameter-free algorithm would limit our ability to impose our prejudices, expectations, and presumptions on the problem at hand, and would let the data itself speak to us. In this work, we show that recent results in bioinformatics and computational theory hold great promise for a parameter-free data-mining paradigm. The results are motivated by observations in Kolmogorov complexity theory. However, as a practical matter, they can be implemented using any off-the-shelf compression algorithm with the addition of just a dozen or so lines of code. We will show that this approach is competitive or superior to the state-of-the-art approaches in anomaly/interestingness detection, classification, and clustering with empirical tests on time series/DNA/text/video datasets.
derkaiseeer
2007-01-29
大多数数据挖掘算法都需要设置许多输入参数。使用参数繁多的算法主要有两大风险。第一，不正确的参数设置可能导致算法无法发现真正的模式。第二，一个或许更隐蔽的问题是，算法可能报告实际上并不存在的虚假模式，或者严重高估所报告模式的显著性。当用户不了解参数在数据挖掘过程中所起的作用时，这种情况尤其容易发生。
数据挖掘算法的参数应当尽可能少，最好没有参数。无参数的算法会限制我们把自己的偏见、期望和预设强加于当前问题的能力，从而让数据本身向我们说话。在本文中，我们表明生物信息学和计算理论领域的最新成果为无参数的数据挖掘范式带来了很大的希望。这些成果受到 Kolmogorov 复杂性理论中一些观察结果的启发。然而在实际操作中，只需使用任何现成的压缩算法，再加上十几行左右的代码，就可以实现它们。我们将通过在时间序列、DNA、文本和视频数据集上的实证测试表明，这种方法在异常/趣味性检测、分类和聚类方面可与最先进的方法相媲美，甚至更优。
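For readers wondering what "any off-the-shelf compression algorithm plus a dozen or so lines of code" might look like, here is a minimal sketch. The passage itself does not name a specific measure, so the ratio used below, C(xy) / (C(x) + C(y)), is an assumption chosen for illustration. The function names, the sample strings, and the choice of zlib are also hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of `data` after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def compression_dissimilarity(x: bytes, y: bytes) -> float:
    """Ratio C(xy) / (C(x) + C(y)); a smaller value suggests more shared structure.

    This particular ratio is an illustrative assumption, not a formula
    stated in the passage above.
    """
    return compressed_size(x + y) / (compressed_size(x) + compressed_size(y))


# Two texts with similar structure, plus pseudo-random bytes of the same length.
text_a = b"the quick brown fox jumps over the lazy dog. " * 20
text_b = b"the quick brown fox leaps over the lazy cat. " * 20
rng = random.Random(0)  # fixed seed so the example is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(len(text_a)))

related = compression_dissimilarity(text_a, text_b)
unrelated = compression_dissimilarity(text_a, noise)
print(f"related: {related:.3f}, unrelated: {unrelated:.3f}")
```

The idea is that when two inputs share structure, compressing them together costs little more than compressing one of them alone, so the ratio drops. Apart from the compressor itself, nothing needs tuning, which seems to be the sense in which the passage calls the approach parameter-free.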