Motivation

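Point Income had an offer that counts as cleared once you solve 300 puzzles in a Sudoku app, but solving them by hand was a chore, so I had Python solve them automatically.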

App: 毎日懸賞ナンプレ (BlueApps Games, Games category, free), apps.apple.com

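Workflow

Mirror the iPhone screen on a PC and take a screenshot of the board.
Recognize the digits in the screenshot and convert them to text.
Solve the puzzle with an algorithm and display the result.

1. Mirror the iPhone screen on a PC and take a screenshot of the board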

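I used LetsView to mirror the iPhone screen onto a Windows PC. Another tool, ApowerMirror, reportedly offers higher picture quality, but its reviews were largely negative, so I passed on it.

[f:id:busongames:20201021205657p:plain] The iPhone screen shows up on the PC like this. The picture quality is rough.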

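2. Recognize the digits in the screenshot and convert them to text

A technique called OCR (Optical Character Recognition/Reader) can apparently turn images into text. This time I used Tesseract and pyocr to read the digits in the image from Python.

Reference: gammasoft.jp

Analyzing the raw screenshot directly did not work well, so I cropped the parts of the image that contain digits and put them into an easier-to-analyze form before running OCR.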

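On Windows a screenshot first lands in the clipboard, so Pillow's ImageGrab.grabclipboard() is used to fetch the image.
Remove the grid lines and everything else in the image that is not a digit. After some experimenting, for this app's color scheme, converting to HSV and turning every pixel whose value (V) is above 130 white, so that only pixels at or below 130 remain, left just the digits.
Extract the digit contours with cv2.findContours() so the regions containing digits can be cropped out of the image. Then run OCR on the detected digits row by row.

That is roughly how the image becomes text data. The messy code follows.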

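import cv2
import numpy as np
import pyocr
import pyocr.builders
from PIL import Image, ImageGrab

# Grab the screenshot from the clipboard
origin_img = ImageGrab.grabclipboard()

if isinstance(origin_img, Image.Image):
    w_after, h_after = 360, 360
    cell_w, cell_h = w_after // 9, h_after // 9
    origin_img = origin_img.convert('RGB').resize((w_after, h_after))
    origin_img_np = np.array(origin_img)
    origin_img_hsv = cv2.cvtColor(origin_img_np, cv2.COLOR_RGB2HSV)
    # Set V to 255 (white) wherever V > 130, so only the dark digits remain
    mask = origin_img_hsv[:, :, 2] > 130
    hsv_filtered = np.copy(origin_img_hsv)
    hsv_filtered[:, :, 2] = np.where(mask, 255, origin_img_hsv[:, :, 2])
    gray_filtered = hsv_filtered[:, :, 2]
    gray_filtered = 255 - gray_filtered

    # Detect the digit contours
    thresh = cv2.adaptiveThreshold(gray_filtered, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 11, 2)
    # [-2] picks the contour list on both OpenCV 3.x and 4.x
    contours = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

    gray_filtered = 255 - gray_filtered
    rgb_filtered = cv2.cvtColor(gray_filtered, cv2.COLOR_GRAY2RGB)

    # boxes[r][c] holds (x, y, w, h) of the digit in cell (r, c); all zeros means empty
    boxes = np.zeros((9, 9, 4))

    # Keep the largest bounding rectangle found in each cell
    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)
        r = y // cell_h
        c = x // cell_w
        if w * h > boxes[r][c][2] * boxes[r][c][3]:
            boxes[r][c] = [x, y, w, h]

    # np.int was removed in NumPy 1.24; the built-in int works everywhere
    boxes = boxes.astype(int)
    tools = pyocr.get_available_tools()
    if not tools:
        raise RuntimeError('No OCR tool found; install Tesseract first')
    tool = tools[0]
    builder = pyocr.builders.DigitBuilder(tesseract_layout=6)

    txt = ''

    def calc(box):
        """Crop one digit and resize it to the size of a single cell."""
        x, y, w, h = box
        cropped_img = rgb_filtered[y:y + h, x:x + w]
        cropped_img_pil = Image.fromarray(cropped_img)
        return cropped_img_pil.resize((cell_w, cell_h))

    # Join the cropped digit images of each row and run OCR on the strip
    for r in range(9):
        # Crop
        ds = [calc(boxes[r][c]) for c in range(9) if np.sum(boxes[r][c]) > 0]
        if not ds:
            # No digits detected in this row
            txt += '.' * 9
            continue
        # Concatenate
        tmp = Image.new('RGB', (cell_w * len(ds), cell_h))
        for i, d in enumerate(ds):
            tmp.paste(d, (cell_w * i, 0))

        w_t, h_t = tmp.size
        tmp = tmp.resize((w_t // 2, h_t // 2))

        # OCR; keep digits only in case the output contains spaces
        txt_i = tool.image_to_string(tmp, lang='eng', builder=builder)
        digits = [ch for ch in txt_i if ch.isdigit()]
        idx = 0
        for c in range(9):
            if np.sum(boxes[r][c]) > 0:
                # '?' marks a cell where OCR returned fewer digits than expected
                txt += digits[idx] if idx < len(digits) else '?'
                idx += 1
            # Cells without a digit become a period
            else:
                txt += '.'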
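One step the prose does not spell out is how each detected rectangle ends up in a board cell. The sketch below isolates that idea without OpenCV, assuming the board has been resized to 360 px as in the listing. `assign_boxes_to_cells` and the sample rectangles are hypothetical, made up for illustration only.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h), the shape cv2.boundingRect returns


def assign_boxes_to_cells(rects: List[Box], board_px: int = 360) -> Dict[Tuple[int, int], Box]:
    """Map each rectangle to its (row, col) cell, keeping the largest one per cell."""
    cell = board_px // 9
    best: Dict[Tuple[int, int], Box] = {}
    for x, y, w, h in rects:
        # Integer division by the cell size gives the cell index; clamp to 0..8
        key = (min(y // cell, 8), min(x // cell, 8))
        old = best.get(key)
        if old is None or w * h > old[2] * old[3]:
            best[key] = (x, y, w, h)
    return best


# Hypothetical rectangles: a digit, a speck of noise in the same cell, another digit
rects = [(52, 8, 14, 22), (55, 12, 3, 3), (170, 10, 15, 23)]
cells = assign_boxes_to_cells(rects)
```

Keeping only the largest rectangle per cell is what lets small specks of noise inside a cell be ignored in favor of the digit itself.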

[f:id:busongames:20201021212514p:plain] The screenshot.

This

[f:id:busongames:20201021212534p:plain] The image after filtering it down to digits only, with the detected bounding rectangles drawn as red boxes.

turns into this,

.6..84..7..23..5..48..2..61..49.28..95.87..32.28..36.95..24.79..47.91..32..7..14.

and finally into this.

3. Solve with an algorithm and display the result

Omitted. A quick search turns up plenty of approaches.
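For readers who want something concrete, here is a minimal sketch of one common approach: plain backtracking that always fills the empty cell with the fewest candidates first. This is an assumption on my part, not necessarily the method used for this post, and `candidates` and `solve` are hypothetical names. The input is the 81-character string format produced in step 2.

```python
from typing import List, Optional


def candidates(cells: List[int], pos: int) -> List[int]:
    """Digits that do not clash with the row, column, or 3x3 box of cell `pos`."""
    r, c = divmod(pos, 9)
    used = set()
    for i in range(9):
        used.add(cells[r * 9 + i])                        # same row
        used.add(cells[i * 9 + c])                        # same column
        br, bc = 3 * (r // 3) + i // 3, 3 * (c // 3) + i % 3
        used.add(cells[br * 9 + bc])                      # same 3x3 box
    return [d for d in range(1, 10) if d not in used]


def solve(puzzle: str) -> Optional[str]:
    """Return the solved grid as an 81-character string, or None if unsolvable."""
    cells = [0 if ch == '.' else int(ch) for ch in puzzle]

    def backtrack() -> bool:
        # Pick the empty cell with the fewest candidates to keep the search small
        best_pos, best = -1, None
        for pos in range(81):
            if cells[pos] == 0:
                cand = candidates(cells, pos)
                if best is None or len(cand) < len(best):
                    best_pos, best = pos, cand
        if best is None:
            return True  # no empty cells left: solved
        for d in best:
            cells[best_pos] = d
            if backtrack():
                return True
        cells[best_pos] = 0  # undo and let the caller try another digit
        return False

    return ''.join(map(str, cells)) if backtrack() else None


puzzle = '.6..84..7..23..5..48..2..61..49.28..95.87..32.28..36.95..24.79..47.91..32..7..14.'
solution = solve(puzzle)
```

Backtracking has poor worst-case behavior, but a board with this many given digits leaves few branches to explore.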

Results

So far, the answer comes out in about one second.

. 6 . |. 8 4 |. . 7 
. . 2 |3 . . |5 . .
4 8 . |. 2 . |. 6 1
------+------+------
. . 4 |9 . 2 |8 . .
9 5 . |8 7 . |. 3 2
. 2 8 |. . 3 |6 . 9
------+------+------
5 . . |2 4 . |7 9 .
. 4 7 |. 9 1 |. . 3
2 . . |7 . . |1 4 .

3 6 5 |1 8 4 |9 2 7
1 7 2 |3 6 9 |5 8 4
4 8 9 |5 2 7 |3 6 1
------+------+------
6 3 4 |9 1 2 |8 7 5
9 5 1 |8 7 6 |4 3 2
7 2 8 |4 5 3 |6 1 9
------+------+------
5 1 3 |2 4 8 |7 9 6
8 4 7 |6 9 1 |2 5 3
2 9 6 |7 3 5 |1 4 8
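The grids above can be printed from an 81-character string with a small formatter. The sketch below is a guess at the layout based on the output shown, not the code actually used, and `format_grid` is a hypothetical name.

```python
def format_grid(grid: str) -> str:
    """Render an 81-character grid as nine rows with 3x3 box separators."""
    lines = []
    for r in range(9):
        if r in (3, 6):
            lines.append('------+------+------')  # separator between box rows
        row = grid[r * 9:(r + 1) * 9]
        # Three groups of three cells, each cell followed by a space
        groups = [' '.join(row[i:i + 3]) + ' ' for i in (0, 3, 6)]
        lines.append('|'.join(groups).rstrip())
    return '\n'.join(lines)


puzzle = '.6..84..7..23..5..48..2..61..49.28..95.87..32.28..36.95..24.79..47.91..32..7..14.'
text = format_grid(puzzle)
print(text)
```

The same function works for both the unsolved string (with periods) and the solved one.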

All that is left is to type this into the iPhone app. Nice!

I want to eat meat sushi.

日々の記録

I had a program solve a Sudoku app automatically
