邮编库的邮政编码数据处理程序
来自Jamesqi
--James Qi 2010年3月2日 (二) 10:03 (CST)
过年前几天从网上下载了170万条邮政编码数据库,让同事帮忙导入SQL Server处理都因为数据量太大而无法执行。
后来自己摸索这用Word中的老VB编了点字符处理的程序,好歹算是可用了,下面把程序记录下来,留着备用。
按邮政编码汇总程序
2010-2-12
说明:先将下载的文件解压、从Access中导出文本文件、按照邮政编码进行排序,再用下面的程序来处理,1669109条原始数据变为35051条按照邮政编码列出的数据。
Sub test() Const ForReading = 1, ForWriting = 2, ForAppending = 8 Dim fs, f1, f2, l, s, e, c, w, i Set fs = CreateObject("Scripting.FileSystemObject") Set f1 = fs.OpenTextFile("c:\test1.txt", ForReading) Set f2 = fs.OpenTextFile("c:\test2.txt", ForWriting, True) l = f1.readline postcode = Mid(l, 11, 6) areacode = Mid(l, 18, 6) area = Mid(l, 25, InStr(25, l, "|") - 25) Address = Mid(l, InStr(25, l, "|") + 1, Len(l) - InStr(25, l, "|") - 2) Count = 1 w = "<title>" + postcode + "</title>" + Chr(13) + Chr(10) + "{{zipcode" + Chr(13) + Chr(10) + "|邮编=" + postcode + Chr(13) + Chr(10) + "|行政代码" + Trim$(Str$(Count)) + "=" + areacode + "|地区" + Trim$(Str$(Count)) + "=" + area + "|地址" + Trim$(Str$(Count)) + "=" + Address Do While f1.AtEndOfStream <> True Do While f1.AtEndOfStream <> True l2 = f1.readline postcode2 = Mid(l2, 11, 6) areacode2 = Mid(l2, 18, 6) area2 = Mid(l2, 25, InStr(25, l2, "|") - 25) address2 = Mid(l2, InStr(25, l2, "|") + 1, Len(l2) - InStr(25, l2, "|") - 2) If postcode = postcode2 Then Count = Count + 1 w = w + Chr(13) + Chr(10) + "|行政代码" + Trim$(Str$(Count)) + "=" + areacode2 + "|地区" + Trim$(Str$(Count)) + "=" + area2 + "|地址" + Trim$(Str$(Count)) + "=" + address2 l = l2 postcode = postcode2 areacode = areacode2 area = area2 Address = address2 Else l = l2 postcode = postcode2 areacode = areacode2 area = area2 Address = address2 total = Count Count = 1 Exit Do End If Loop w = w + Chr(13) + Chr(10) + "|数量=" + Trim$(Str$(total)) + Chr(13) + Chr(10) + "}}" f2.writeline (w) w = "<title>" + postcode + "</title>" + Chr(13) + Chr(10) + "{{zipcode" + Chr(13) + Chr(10) + "|邮编=" + postcode + Chr(13) + Chr(10) + "|行政代码" + Trim$(Str$(Count)) + "=" + areacode + "|地区" + Trim$(Str$(Count)) + "=" + area + "|地址" + Trim$(Str$(Count)) + "=" + Address Loop f1.Close f2.Close End Sub
原始数据举例:
{{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|北马神庙街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|北垣东街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|北垣街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|北垣西街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|财神庙街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|东风路}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|东护城河路}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|东落凤街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|东影北街}} {{zipcode|010010|150102|内蒙古自治区呼和浩特市新城区|二眼井巷}} ...... {{zipcode|010010|150105|内蒙古自治区呼和浩特市赛罕区|昭乌达路89-89(单号)}} {{zipcode|010010|150105|内蒙古自治区呼和浩特市赛罕区|铸管街}}
010010的457条处理成:
<title>003105</title> {{zipcode |邮编=003105 |行政代码1=140728|地区1=山西省晋中市平遥县|地址1=朱坑乡贾家庄 |数量=1 }} <title>010010</title> {{zipcode |邮编=010010 |行政代码1=150102|地区1=内蒙古自治区呼和浩特市新城区|地址1=北马神庙街 |行政代码2=150102|地区2=内蒙古自治区呼和浩特市新城区|地址2=北垣东街 |行政代码3=150102|地区3=内蒙古自治区呼和浩特市新城区|地址3=北垣街 |行政代码4=150102|地区4=内蒙古自治区呼和浩特市新城区|地址4=北垣西街 |行政代码5=150102|地区5=内蒙古自治区呼和浩特市新城区|地址5=财神庙街 |行政代码6=150102|地区6=内蒙古自治区呼和浩特市新城区|地址6=东风路 |行政代码7=150102|地区7=内蒙古自治区呼和浩特市新城区|地址7=东护城河路 |行政代码8=150102|地区8=内蒙古自治区呼和浩特市新城区|地址8=东落凤街 |行政代码9=150102|地区9=内蒙古自治区呼和浩特市新城区|地址9=东影北街 |行政代码10=150102|地区10=内蒙古自治区呼和浩特市新城区|地址10=二眼井巷 ...... |行政代码456=150105|地区456=内蒙古自治区呼和浩特市赛罕区|地址456=铸管街 |行政代码457=150105|地区457=内蒙古自治区呼和浩特市赛罕区|地址457=敕勒川大街 |数量=457 }}
按地区名称汇总程序
2010-3-2
说明:一个程序先将原始数据中的邮政编码和地区换位,然后用文本处理器排序,再用一个程序来处理成最后需要的按照地区名称为标题的页面数据。
第一个程序:
Sub test201003302() Const ForReading = 1, ForWriting = 2, ForAppending = 8 Dim fs, f1, f2, l, s, e, c, w, i Set fs = CreateObject("Scripting.FileSystemObject") Set f1 = fs.OpenTextFile("c:\test1.txt", ForReading) Set f2 = fs.OpenTextFile("c:\test3.txt", ForWriting, True) Do While f1.AtEndOfStream <> True l = f1.readline postcode = Mid(l, 11, 6) areacode = Mid(l, 18, 6) area = Mid(l, 25, InStr(25, l, "|") - 25) Address = Mid(l, InStr(25, l, "|") + 1, Len(l) - InStr(25, l, "|") - 2) w = area + l f2.writeline (w) Loop f1.Close f2.Close End Sub
第二个程序:
Sub test201033() Const ForReading = 1, ForWriting = 2, ForAppending = 8 Dim fs, f1, f2, l, s, e, c, w, i Set fs = CreateObject("Scripting.FileSystemObject") Set f1 = fs.OpenTextFile("c:\test3.txt", ForReading) Set f2 = fs.OpenTextFile("c:\test4.txt", ForWriting, True) l = f1.readline Start = InStr(1, l, "{") Start = Start - 1 postcode = Mid(l, Start + 11, 6) areacode = Mid(l, Start + 18, 6) area = Mid(l, Start + 25, InStr(Start + 25, l, "|") - Start - 25) Address = Mid(l, InStr(Start + 25, l, "|") + 1, Len(l) - InStr(Start + 25, l, "|") - 2) count1 = 0 w1 = "<title>" + area + "</title>" f2.writeline (w1) w = "{{zipcode" f2.writeline (w) w = "|邮编=" + postcode f2.writeline (w) w = "|行政代码=" + areacode f2.writeline (w) w = "|地区=" + area f2.writeline (w) Count = 0 Do While f1.AtEndOfStream <> True l2 = f1.readline start2 = InStr(1, l2, "{") start2 = start2 - 1 postcode2 = Mid(l2, start2 + 11, 6) areacode2 = Mid(l2, start2 + 18, 6) area2 = Mid(l2, start2 + 25, InStr(start2 + 25, l2, "|") - start2 - 25) Address2 = Mid(l2, InStr(start2 + 25, l2, "|") + 1, Len(l2) - InStr(start2 + 25, l2, "|") - 2) If area = area2 Then If postcode = postcode2 Then Count = Count + 1 w = "|地址" + Trim$(Str$(Count)) + "=" + Address f2.writeline (w) l = l2 postcode = postcode2 areacode = areacode2 area = area2 Address = Address2 Else Count = Count + 1 count1 = count1 + 1 w = "|地址" + Trim$(Str$(Count)) + "=" + Address f2.writeline (w) w = "|数量=" + Trim$(Str$(Count)) f2.writeline (w) w = "|顺序=" + Trim$(Str$(count1)) f2.writeline (w) w = "}}" f2.writeline (w) Count = 0 w = "{{zipcode" f2.writeline (w) w = "|邮编=" + postcode f2.writeline (w) w = "|行政代码=" + areacode f2.writeline (w) w = "|地区=" + area f2.writeline (w) l = l2 postcode = postcode2 areacode = areacode2 area = area2 Address = Address2 End If Else Count = Count + 1 w = "|地址" + Trim$(Str$(Count)) + "=" + Address f2.writeline (w) w = "|数量=" + Trim$(Str$(Count)) f2.writeline (w) Count = 0 count1 = count1 + 1 w = "|顺序=" + Trim$(Str$(count1)) f2.writeline (w) w = "}}" f2.writeline (w) count1 = 0 w = "<title>" + area2 + "</title>" f2.writeline (w) w = "{{zipcode" f2.writeline (w) w = "|邮编=" + postcode2 f2.writeline (w) w = "|行政代码=" + areacode2 f2.writeline (w) w = "|地区=" + area2 f2.writeline (w) l = l2 postcode = postcode2 areacode = areacode2 area = area2 Address = Address2 total = Count End If Loop f1.Close f2.Close End Sub
程序被我反复改得乱七八糟,算是勉强能用。
标签:邮编库、邮政编码、程序。 |
相关内容:
|