I wrote a JSP version of Hangul jamo decomposition (splitting each syllable into its component letters).

 

<%@ page contentType="text/html;charset=ms949" %>


<%!
private static int []DefChoSung= new int[]{0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142, 0x3143, 0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314a, 0x314b, 0x314c, 0x314d, 0x314e};
private static int []DefJungSung= new int[]{0x314f, 0x3150, 0x3151, 0x3152, 0x3153, 0x3154, 0x3155, 0x3156, 0x3157, 0x3158, 0x3159, 0x315a, 0x315b, 0x315c, 0x315d, 0x315e, 0x315f, 0x3160, 0x3161, 0x3162, 0x3163};
private static int []DefJongSung= new int[]{0,      0x3131, 0x3132, 0x3133, 0x3134, 0x3135, 0x3136, 0x3137, 0x3139, 0x313a, 0x313b, 0x313c, 0x313d, 0x313e, 0x313f, 0x3140, 0x3141, 0x3142, 0x3144, 0x3145, 0x3146, 0x3147, 0x3148, 0x314a, 0x314b, 0x314c, 0x314d, 0x314e};

 

// Splits each precomposed Hangul syllable in the input into its initial, medial, and final jamo.
public String getParsingHangle(String hangle) {
  
    if (hangle == null)
            return null;
       
    StringBuffer buffer = new StringBuffer();
    for (int i = 0; i < hangle.length(); i++) {
        char han = hangle.charAt(i);       
        int ihan = (int)han;
        int icho, ijung, ijong;
                   
        // Precomposed syllables occupy U+AC00..U+D7A3, ordered as 19 initials x 21 medials x 28 finals.
        if (ihan >= 0xAC00 && ihan <= 0xD7A3) {
            ijong = ihan - 0xAC00;
            icho = ijong / (21*28);
            ijong = ijong % (21*28);
            ijung = ijong / 28;
            ijong = ijong % 28;

            buffer.append((char)(DefChoSung[icho]));
            buffer.append((char)(DefJungSung[ijung]));
            // Index 0 means "no final consonant"; appending it would emit a NUL character.
            if (ijong > 0)
                buffer.append((char)DefJongSung[ijong]);
        } else
            buffer.append(han);
    }
       
    return buffer.toString();       
}
%>

 

<%
String s = "똠방각하";
out.println(s+"<br>");
out.print(getParsingHangle(s));
%>

 

This can be used for the search-term suggestion features that are popular these days.
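As a rough illustration of how the decomposition could feed a suggestion feature, here is a minimal sketch in plain Java. The class name JamoPrefixMatcher and its suggest method are hypothetical and not part of the JSP above, and matching by jamo prefix is only one possible approach. The point is that a half-typed query such as "똠바" is not a string prefix of "똠방각하", but its jamo sequence is a prefix of the candidate's jamo sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the original post: prefix matching on jamo sequences.
class JamoPrefixMatcher {
    private static final String CHO = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ";
    private static final String JUNG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ";
    // The leading space is a placeholder for index 0, "no final consonant".
    private static final String JONG = " ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ";

    // Same arithmetic as getParsingHangle, restated so the sketch is self-contained.
    static String decompose(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0xAC00 && c <= 0xD7A3) {
                int offset = c - 0xAC00;
                sb.append(CHO.charAt(offset / (21 * 28)));
                sb.append(JUNG.charAt(offset % (21 * 28) / 28));
                int jong = offset % 28;
                if (jong > 0) sb.append(JONG.charAt(jong));
            } else {
                sb.append(c); // not a precomposed syllable: copy unchanged
            }
        }
        return sb.toString();
    }

    // Returns the candidates whose jamo sequence starts with the query's jamo sequence.
    static List<String> suggest(String query, List<String> candidates) {
        String key = decompose(query);
        List<String> hits = new ArrayList<>();
        for (String candidate : candidates) {
            if (decompose(candidate).startsWith(key)) hits.add(candidate);
        }
        return hits;
    }
}
```

In a real service the decomposed form of each candidate would probably be computed once and stored, rather than recomputed for every query as this sketch does.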

 

Entering "똠방각하" produces the following result:

ㄸㅗㅁㅂㅏㅇㄱㅏㄱㅎㅏ
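To make the index arithmetic behind this result easier to follow, here is a small sketch that returns the three table indices for a single syllable. The class name JamoIndexDemo and the method indices are made up for illustration; the calculation itself is the same one used in getParsingHangle.

```java
// Hypothetical helper, not from the original post: exposes the three table indices.
class JamoIndexDemo {
    // Returns {initial, medial, final} indices for one precomposed syllable,
    // or null when the character is outside U+AC00..U+D7A3.
    static int[] indices(char syllable) {
        if (syllable < 0xAC00 || syllable > 0xD7A3) return null;
        int offset = syllable - 0xAC00;      // position within the syllable block
        int cho = offset / (21 * 28);        // each initial spans 21 * 28 = 588 syllables
        int jung = offset % (21 * 28) / 28;  // each medial spans 28 syllables
        int jong = offset % 28;              // 0 means no final consonant
        return new int[] {cho, jung, jong};
    }
}
```

For "똠" (U+B620) the offset is 2592. Dividing by 588 gives 4 (ㄸ) with remainder 240, dividing 240 by 28 gives 8 (ㅗ) with remainder 16 (ㅁ), so indices('똠') yields {4, 8, 16}. For "하" the final index is 0, which is why nothing is appended after ㅏ.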

 

Drawbacks of the code above

Compound vowels such as "ㅙ" and compound final consonants such as "ㄶ" are not split into their component letters.
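One possible way around this drawback is a second pass that expands compound jamo in the already-decomposed string. The sketch below is an assumption on my part, not part of the original code: the class name CompoundJamoSplitter is hypothetical, the mapping table follows the usual two-stroke composition of each letter, and doubled consonants such as ㄲ, ㄸ, and ㅆ are deliberately left intact.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical post-processing step, not from the original post.
class CompoundJamoSplitter {
    private static final Map<Character, String> PARTS = new HashMap<>();
    static {
        // Compound vowels
        PARTS.put('ㅘ', "ㅗㅏ"); PARTS.put('ㅙ', "ㅗㅐ"); PARTS.put('ㅚ', "ㅗㅣ");
        PARTS.put('ㅝ', "ㅜㅓ"); PARTS.put('ㅞ', "ㅜㅔ"); PARTS.put('ㅟ', "ㅜㅣ");
        PARTS.put('ㅢ', "ㅡㅣ");
        // Compound final consonants
        PARTS.put('ㄳ', "ㄱㅅ"); PARTS.put('ㄵ', "ㄴㅈ"); PARTS.put('ㄶ', "ㄴㅎ");
        PARTS.put('ㄺ', "ㄹㄱ"); PARTS.put('ㄻ', "ㄹㅁ"); PARTS.put('ㄼ', "ㄹㅂ");
        PARTS.put('ㄽ', "ㄹㅅ"); PARTS.put('ㄾ', "ㄹㅌ"); PARTS.put('ㄿ', "ㄹㅍ");
        PARTS.put('ㅀ', "ㄹㅎ"); PARTS.put('ㅄ', "ㅂㅅ");
    }

    // Expands compound jamo in an already-decomposed string; other characters pass through.
    static String split(String jamo) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < jamo.length(); i++) {
            char c = jamo.charAt(i);
            String parts = PARTS.get(c);
            if (parts != null) sb.append(parts);
            else sb.append(c);
        }
        return sb.toString();
    }
}
```

With this, the decomposed form of "돼지", "ㄷㅙㅈㅣ", becomes "ㄷㅗㅐㅈㅣ", and the decomposed form of "많다", "ㅁㅏㄶㄷㅏ", becomes "ㅁㅏㄴㅎㄷㅏ". Whether the extra split is desirable depends on how the search input is matched, so treat the table as a starting point.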

Posted by The.민군