join:file is not in sorted order

问题描述

join:file is not in sorted order

linux下使用join关联两个文件
两文件1和2内容如下:
[c7@db1 shconfig]$ cat 1
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
10|1|sd|好的|0
11|2|恩呢| |2

[c7@db1 shconfig]$ cat 2
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

希望经过关联文件1的第一列和文件2的第三列之后的结果是这样的:
1|1|目3|的|0|35|2|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
1|1|目3|的|0|37|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
10|1|sd|好的|0|34|1|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

但是对文件1和2进行排序后以join关联结果如下:
[c7@db1 shconfig]$ sort -k1 -t'|' 1 > 3
[c7@db1 shconfig]$ sort -k3 -t'|' 2 > 4
[c7@db1 shconfig]$ join -j1 1 -j2 3 -t'|' 3 4
10|1|sd|好的|0|34|1|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|
join: file 1 is not in sorted order
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
附:[c7@db1 shconfig]$ cat 3
10|1|sd|好的|0
11|2|恩呢| |2
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
[c7@db1 shconfig]$ cat 4
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|

再用以数字排序后进行关联的方式:
[c7@db1 shconfig]$ sort -n -k1 -t'|' 1 > 5
[c7@db1 shconfig]$ sort -n -k3 -t'|' 2 > 6
[c7@db1 shconfig]$ join -j1 1 -j2 3 -t'|' 5 6
1|1|目3|的|0|35|2|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
1|1|目3|的|0|37|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
3|1|Test|dd|0|39|1|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
3|1|Test|dd|0|40|1|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
join: file 1 is not in sorted order

附:
[c7@db1 shconfig]$ cat 5
1|1|目3|的|0
3|1|Test|dd|0
4|1|jjjj|dd|0
6|1|等发达|大水|0
7|1|3组| |0
10|1|sd|好的|0
11|2|恩呢| |2
[c7@db1 shconfig]$ cat 6
35|2|1|Jul 10 2014 4:03:52.000000PM|1|67.0|66.0|1|test|2|
37|1|1|Jul 10 2014 2:56:34.000000PM|26|200.0|30.0|1|dddd|0|
39|1|3|Jul 10 2014 2:57:03.000000PM|6|100.0|50.0|1|aaaaa|1|
40|1|3|Jul 10 2014 2:57:03.000000PM|6|60.0|10.0|1|aaaaa|1|
34|1|10|Jul 10 2014 1:23:01.000000PM|1|00.0|60.0|1|lllll|2|

实在搞不懂为什么,如果大家有解决办法请务必告知,谢谢。。

附:版本及LANG
[c7@db1 shconfig]$ sort --version
sort (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel and Paul Eggert.
[c7@db1 shconfig]$ join --version
join (GNU coreutils) 8.4
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Mike Haertel.
[c7@db1 shconfig]$ echo $LANG
C

时间: 2024-09-07 01:18:47

join:file is not in sorted order的相关文章

Reading files in JavaScript using the File APIs

Reading files in JavaScript using the File APIs By Eric Bidelman Published: June 18th, 2010Comments: 273 Introduction HTML5 finally provides a standard way to interact with local files, via the File APIspecification. As example of its capabilities, t

文件系统函数库:file

file (PHP3 , PHP4) file ---&http://www.aliyun.com/zixun/aggregation/37954.html">nbsp; 读取文件全部内容到数组中 语法 : array file (string filename [, int use_include_path]) 说明 : 与readfile( )相同,不同处在于此函数是传回数组,各个数组的元素相当于文件中的行数.新行(newline)依旧是附属的. 如果你想要在include_p

PHP4用户手册:函数-file

TABLE border=0 cellPadding=0 cellSpacing=0 height="100%" width="100%">  (PHP 3, PHP 4 >= 4.0.0)file -- 读取整个文件到数组描述 array file (string filename [, int use_include_path]) 除了file()返回的是数组,其它的与readfile()一样.数组的每个元素对应文件的每一行,而带有换行符.注意:在结

MySQL的几个参数

back_log : 操作系统保持监听的队列,这个是在mysql 进行thread 进行连接之前的操作 如果你有很多的连接数 并且出现 connection refused 的错误提示,增加这个值肯定没错. skip-networking : 这种是不在监听 TCP/IP 连接.这个时候只能通过socket 或者 named pip 连接 max_connect_errors: 允许每个host 连接出错的次数,如果达到该值,该host就不能在进行连接除非使用 flush hosts 命令或者重

Hbase简介和基本用法

一.简介 history  started by chad walters and jim 2006.11 G release paper on BigTable 2007.2 inital HBase prototype created as Hadoop contrib 2007.10 First useable Hbase 2008.1 Hadoop become Apache top-level project and Hbase becomes subproject 2008.10 H

MySQL 5.6.16/5.5.36 发布

MySQL各产品线更新.5.6.16/5.5.36 2014-01-31之前版本是2013-09-20的5.6.15/5.5.35,主要是Bug修正,cmake支持了-DTMPDIR,增强了ALTER_TABLE.5.1系列还是5.1.73. 完全改进: Changes in MySQL 5.6.16 (2014-01-31) A known limitation of this release: Note Building MySQL from source on Windows using

Cassandra - A Decentralized Structured Storage System

http://www.cs.cornell.edu/projects/ladis2009/papers/lakshman-ladis2009.pdf, 英文 http://www.dbthink.com/?p=372, 中文 对Cassandra并没有深入研究, 在data server上copy了bigtable, 而在分布式nodes管理上copy了Dynamo的去中心化的架构, 可以认为是去中心化设计的Bigtable. 当然他在copy bigtable的数据模型和SSTable机制的同

MySQL 5.5.49 大内存优化配置文件优化详解_Mysql

一.配置文件说明 my-small.cnf my-medium.cnf my-large.cnf my-huge.cnf my-innodb-heavy-4G.cnf  二.详解 my-innodb-heavy-4G.cnf三.配置文件优化 注:环境说明,CentO5.5 x86_64+MySQL-5.5.32 相关软件下载:http://yunpan.cn/QtaCuLHLRKzRq 一.配置文件说明 Mysql-5.5.49是Mysql5.5系列中最后一个版本,也是最后一个有配置文件的版本,

【OH】Glossary Oracle词汇表(上)

Glossary [OH]Glossary Oracle词汇表(上) Oracle? Multimedia DICOM Developer's Guide 11g Release 2 (11.2) E10778-03 Glossary ● anonymity document An XML document that specifies the set of attributes to be made anonymous, and defines the actions required to