Python + Selenium

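I. Preliminary check (critical)

Chrome 49.0.2623.112 is a very old release, so first confirm what your "browser system" actually is:

Plain Chrome 49 (a 2016 release that no longer receives updates)
A Chinese dual-engine browser in compatibility mode (360, Sogou, QQ Browser, etc.; in that mode pages are rendered by the IE engine behind a Chrome-like shell)
A customized browser shell (common in government and enterprise deployments)

How to tell: type chrome://version in the address bar and see whether the page opens.

If it opens → use Selenium + ChromeDriver 2.22
If it does not open, or it reports an IE version → use Selenium + IE Driver, or Pywinauto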
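
As a second check next to chrome://version, you can type navigator.userAgent in the F12 Console and classify the string it prints. The sketch below is only a heuristic under my own assumptions: the function names are made up for illustration, and it relies on IE-engine pages reporting "Trident/" or "MSIE " while Chrome-engine pages report "Chrome/".

```python
def detect_engine(user_agent):
    """Classify a navigator.userAgent string as 'ie', 'chrome' or 'unknown'."""
    ua = user_agent or ""
    # IE-engine pages (including dual-engine browsers in compatibility mode)
    # report "Trident/" or, for older IE versions, "MSIE ".
    if "Trident/" in ua or "MSIE " in ua:
        return "ie"
    if "Chrome/" in ua:
        return "chrome"
    return "unknown"


def suggested_driver(user_agent):
    """Map the detected engine to the driver named in the text above."""
    choices = {"ie": "IEDriverServer", "chrome": "ChromeDriver 2.22"}
    return choices.get(detect_engine(user_agent), "check manually")
```

If the result is "check manually", fall back to the chrome://version test.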

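II. Offline dependency download list

Because of compatibility constraints with Python 3.7.9 and the old ChromeDriver, pin the following exact versions: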

Package	Version	Download URL (Aliyun/Tsinghua mirror)	File size
selenium	3.141.0	https://mirrors.aliyun.com/pypi/packages/80/d6/4294f0b4bce4de0abf13e171c47c3b3858e183328b7f46916f43f8c15e79/selenium-3.141.0-py2.py3-none-any.whl	~900KB
urllib3	1.25.11	https://mirrors.aliyun.com/pypi/packages/56/aa/4ef5aa67a9a62505db124a5cb5262332f1dbdd5ce1df8f1f25f622520e7b/urllib3-1.25.11-py2.py3-none-any.whl	~130KB
idna	2.10	https://mirrors.aliyun.com/pypi/packages/a2/38/928ddce2273eaa564f6f50de919327bf3a00f091b5baba8dfa9460f3a8a8/idna-2.10-py2.py3-none-any.whl	~58KB
chardet	3.0.4	https://mirrors.aliyun.com/pypi/packages/bc/a9/01ffebfb562e4274b6487b4bb1ddec7ca55ec7510b22e4c51f14098443b8/chardet-3.0.4-py2.py3-none-any.whl	~133KB
certifi	2020.12.5	https://mirrors.aliyun.com/pypi/packages/5e/a0/5f06e1e1d463903cf0c0eebeb751791119ed7a4b48d09dc198f61e29cb0b/certifi-2020.12.5-py2.py3-none-any.whl	~145KB

Driver downloads (critical)

ChromeDriver 2.22 (matches Chrome 49):
https://chromedriver.storage.googleapis.com/2.22/chromedriver_win32.zip
If the browser actually runs on the IE engine, download IEDriverServer 3.141.0:
https://selenium-release.storage.googleapis.com/3.141/IEDriverServer_Win32_3.141.0.zip

III. Offline installation steps on the intranet

1. Prepare on an internet-connected machine (copy to the intranet by USB drive)

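cmd

rem Run on the internet-connected PC: create the folder for the packages
mkdir d:\offline_packages
cd /d d:\offline_packages

rem Download all of the whl files listed above, plus chromedriver, into this folder
rem Also put the Python 3.7.9 installer (python-3.7.9-amd64.exe) on the USB drive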
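
Before you carry the USB drive over, it is worth checking that nothing is missing, since one forgotten file means another trip. A minimal sketch, assuming everything (including the Python installer) sits in one folder; the function name is mine, and the file names are taken from the download list above.

```python
import os

# File names taken from the download list above.
REQUIRED_FILES = [
    "selenium-3.141.0-py2.py3-none-any.whl",
    "urllib3-1.25.11-py2.py3-none-any.whl",
    "idna-2.10-py2.py3-none-any.whl",
    "chardet-3.0.4-py2.py3-none-any.whl",
    "certifi-2020.12.5-py2.py3-none-any.whl",
    "chromedriver_win32.zip",
    "python-3.7.9-amd64.exe",
]


def missing_files(folder, required=REQUIRED_FILES):
    """Return the required file names that are not present in `folder`."""
    present = set(os.listdir(folder)) if os.path.isdir(folder) else set()
    return [name for name in required if name not in present]
```

Calling missing_files(r"d:\offline_packages") returns an empty list when the folder is complete.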

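2. Install Python on the intranet machine

Recommended install path: C:\Python37
Be sure to tick: Add Python to PATH, and Install pip

3. Install the dependencies on the intranet machine

Put the whl files in D:\offline_packages, then run: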

cmd

cd /d D:\offline_packages
pip install --no-index --find-links=. selenium-3.141.0-py2.py3-none-any.whl
pip install --no-index --find-links=. urllib3-1.25.11-py2.py3-none-any.whl
pip install --no-index --find-links=. idna-2.10-py2.py3-none-any.whl
pip install --no-index --find-links=. chardet-3.0.4-py2.py3-none-any.whl
pip install --no-index --find-links=. certifi-2020.12.5-py2.py3-none-any.whl
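
After the five installs you can confirm the pinned versions from Python itself. This is a rough sketch, not an official check: the helper names are mine, and it reads each package's __version__ attribute, so a package that does not expose one is reported as not found.

```python
import importlib

# Versions from the download list above.
EXPECTED = {
    "selenium": "3.141.0",
    "urllib3": "1.25.11",
    "idna": "2.10",
    "chardet": "3.0.4",
    "certifi": "2020.12.5",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_versions(expected=EXPECTED):
    """Return {name: (expected, found)} for every mismatch or missing package."""
    problems = {}
    for name, want in expected.items():
        found = installed_version(name)
        if found != want:
            problems[name] = (want, found)
    return problems
```

An empty dict from check_versions() means every package imports and matches the list.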

4. Deploy the driver

Unzip chromedriver_win32.zip and put chromedriver.exe in:

C:\Python37\Scripts\ (recommended; already on PATH)
or the same directory as your script
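
To confirm the driver is where the script will look for it, a lookup like the following can help. It is a sketch under my own naming: locate_chromedriver is a hypothetical helper that checks the folders you pass in first and then falls back to a PATH search.

```python
import os
import shutil


def locate_chromedriver(extra_dirs=()):
    """Return the first chromedriver executable found, or None."""
    exe = "chromedriver.exe" if os.name == "nt" else "chromedriver"
    for folder in extra_dirs:
        candidate = os.path.join(folder, exe)
        if os.path.isfile(candidate):
            return candidate
    # Fall back to searching the directories listed in PATH.
    return shutil.which(exe)
```

For the two locations above you would call locate_chromedriver([r"C:\Python37\Scripts", "."]); a None result means the driver is in neither place and not on PATH.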

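IV. Debugging method on the intranet (key section)

Since I cannot see your pages, use the methods below to extract the page structure and send it to me, or use it directly to locate elements:

Step 1: Inspect the page structure (manual reconnaissance)

Open your visit system and go to the person list page of the first house
Press F12 to open the developer tools
Press Ctrl+Shift+C and click the "人员走访" (visit person) button
In the Elements panel, right-click the highlighted element → Copy → Copy XPath and Copy selector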
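
The two strings you copy in the last step go into Selenium as a (by, value) pair. A small sketch of that mapping follows; to_locator is a hypothetical helper of mine, and the plain strings it returns are the values behind selenium's By.XPATH and By.CSS_SELECTOR constants.

```python
def to_locator(copied):
    """Turn a string copied from DevTools into a (by, value) locator tuple."""
    value = copied.strip()
    # XPath expressions start with "/" (or "(" for grouped expressions);
    # anything else is treated as a CSS selector.
    if value.startswith("/") or value.startswith("("):
        return ("xpath", value)       # same string as By.XPATH
    return ("css selector", value)    # same string as By.CSS_SELECTOR
```

With a live driver you would then write driver.find_element(*to_locator(copied_text)), where copied_text is whatever DevTools put on your clipboard.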

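Step 2: Save samples for analysis (key to offline debugging)

Run the following JS in the Console panel and save the text it returns (it is used to analyze the data structure):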

JavaScript

// List all house names (home page)
JSON.stringify(Array.from(document.querySelectorAll('.house-name, [onclick*="house"], .grid-item span')).map(e => ({text: e.innerText, onclick: e.getAttribute('onclick')})))

// List all persons inside a house (house page)
JSON.stringify(Array.from(document.querySelectorAll('tr, .member-item, [name*="visit"]')).map((row, i) => ({index: i, html: row.outerHTML.substring(0,200)})))
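
Once the text from the first snippet is saved, you can look through it offline with a few lines of Python. This sketch assumes the saved text is the raw JSON array (without the surrounding quotes the Console may display); summarize_dump is a name I made up.

```python
import json


def summarize_dump(json_text):
    """Parse the text saved from the Console and list entries with visible text."""
    items = json.loads(json_text)
    rows = []
    for item in items:
        text = (item.get("text") or "").strip()
        if text:
            # A non-empty onclick value marks the element as clickable.
            rows.append((text, item.get("onclick")))
    return rows
```

Each returned pair is (visible text, onclick handler or None), which shows quickly which elements are the clickable house entries.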

Step 3: Screenshot and source dump (for troubleshooting)

Python

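driver.save_screenshot('debug.png')
# page_source only returns the HTML; write it to a file to actually save it
with open('debug.html', 'w', encoding='utf-8') as f:
    f.write(driver.page_source)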
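
If you end up calling this in many places, a small wrapper keeps the files from overwriting each other. This is a sketch only: dump_debug, the "debug" folder and the file naming are my own choices, and it relies only on the driver's save_screenshot method and page_source attribute used above.

```python
import io
import os
import time


def dump_debug(driver, tag, folder="debug"):
    """Save a screenshot and the page source under a timestamped name."""
    os.makedirs(folder, exist_ok=True)
    stamp = time.strftime("%Y%m%d_%H%M%S")
    base = os.path.join(folder, "{}_{}".format(stamp, tag))
    driver.save_screenshot(base + ".png")
    # Write the source as UTF-8 so Chinese text on the page survives.
    with io.open(base + ".html", "w", encoding="utf-8") as fh:
        fh.write(driver.page_source)
    return base
```

For example, dump_debug(driver, "house_3") writes a .png and a .html with the same timestamped prefix and returns that prefix.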

V. Automation code skeleton (you fill in the locators)

Python

import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, NoSuchElementException

class VisitAutomation:
    def __init__(self):
        # Chrome 49 configuration (critical)
        options = webdriver.ChromeOptions()
        options.add_argument('--ignore-certificate-errors')
        options.add_argument('--allow-running-insecure-content')
        # Old Chrome builds may need GPU acceleration turned off
        options.add_argument('--disable-gpu')
        options.add_argument('--no-sandbox')
        
        self.driver = webdriver.Chrome(
            executable_path=r'C:\Python37\Scripts\chromedriver.exe',
            options=options
        )
        self.driver.implicitly_wait(10)
        self.wait = WebDriverWait(self.driver, 10)
        
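    def start(self, entry_url):
        """Entry point: open the URL, then continue from the home page after manual login."""
        self.driver.get(entry_url)
        input("Log in manually, open the house list home page, then press Enter to continue...")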
        self.process_all_houses()
    
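    def process_all_houses(self):
        """Iterate over every house on every page of the list."""
        # Placeholder locator: matches the page text '户' (household) or an onclick
        # handler containing 'house'. Adjust it to the real className.
        house_xpath = "//*[contains(text(), '户') or contains(@onclick, 'house')]"
        while True:
            # Collect all household links on the current page
            houses = self.driver.find_elements(By.XPATH, house_xpath)

            if not houses:
                print("No houses found; check the page structure")
                break

            print(f"Found {len(houses)} households to process")

            for i in range(len(houses)):
                try:
                    # Re-locate the elements each time to avoid StaleElementReferenceException
                    houses = self.driver.find_elements(By.XPATH, house_xpath)
                    house_name = houses[i].text
                    print(f"\nProcessing: {house_name}")

                    # Click through to the house page
                    houses[i].click()
                    time.sleep(2)  # legacy systems may need time to load

                    self.process_current_house()

                    # Go back to the list page
                    self.back_to_list()

                except Exception as e:
                    print(f"Error while processing house {i}: {str(e)}")
                    self.driver.save_screenshot(f'error_house_{i}.png')

            # Check whether there is a next page
            if not self.has_next_page():
                break
            self.go_next_page()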
    
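    def process_current_house(self):
        """Visit every person listed in the current house."""
        # [Replace with your Copy XPath result.] '人员走访' is the button text on the page.
        btn_xpath = "//*[contains(text(), '人员走访')]"
        while True:
            # One '人员走访' button per person row; count buttons so the index below matches
            total = len(self.driver.find_elements(By.XPATH, btn_xpath))

            if total == 0:
                print("  No persons in this house, or all have been processed")
                break

            # Visit them one by one
            for j in range(total):
                try:
                    # Re-locate each time (the page may have refreshed)
                    visit_btn = self.driver.find_elements(By.XPATH, btn_xpath)[j]
                    # Assumes the name is the element immediately left of the button
                    person_name = visit_btn.find_element(By.XPATH, "./preceding-sibling::*[1]").text

                    print(f"  Visiting: {person_name}")
                    visit_btn.click()
                    time.sleep(2)

                    self.fill_visit_form("已走访")  # visit note text, meaning "visited"
                    self.save_and_return()

                except Exception as e:
                    print(f"  Visit failed for person {j}: {str(e)}")

            # Check whether anyone is left (some kind of completion marker)
            if not self.has_unvisited_persons():
                break
            # Guard against an endless loop: if the buttons stay on the page after a
            # visit, the count never shrinks, so stop after one full pass.
            if len(self.driver.find_elements(By.XPATH, btn_xpath)) >= total:
                break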
    
    def fill_visit_form(self, content):
        """Fill in the visit form."""
        try:
            # Find the content input box (may be a textarea or an input)
            # [Adjust to the real page: it may be name='content', id='visitRemark', etc.]
            textarea = self.wait.until(
                EC.presence_of_element_located((By.TAG_NAME, "textarea"))
            )
            textarea.clear()
            textarea.send_keys(content)
            print(f"    Filled in: {content}")
        except TimeoutException:
            print("    Warning: visit content input box not found")
            raise
    
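    def save_and_return(self):
        """Save the form and return to the house page."""
        try:
            # Click save; '保存' (save) and '提交' (submit) are page texts, adjust to the real button
            save_btn = self.driver.find_element(By.XPATH, "//*[contains(text(), '保存') or contains(text(), '提交')]")
            save_btn.click()
            time.sleep(1.5)  # legacy systems save slowly

            # Accept a browser alert if one appeared
            try:
                alert = self.driver.switch_to.alert
                alert.accept()
                print("    Alert accepted")
            except Exception:
                # No alert was open
                pass

            # Return to the house page via the '返回房屋' (back to house) button
            back_btn = self.driver.find_element(By.XPATH, "//*[contains(text(), '返回房屋')]")
            back_btn.click()
            time.sleep(1)

        except Exception as e:
            print(f"    Save/return failed: {str(e)}")
            # Fall back to the browser's back navigation
            self.driver.back()
            time.sleep(1)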
    
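    def back_to_list(self):
        """Return to the house list home page."""
        try:
            # Option 1: click the breadcrumb or back link ('返回列表' = back to list; use the real text)
            back = self.driver.find_element(By.LINK_TEXT, "返回列表")
            back.click()
        except Exception:
            # Option 2: go back two levels in the history via JS
            self.driver.execute_script("window.history.go(-2);")
        time.sleep(1)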
    
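    def has_next_page(self):
        """Whether a next page exists (paginated lists)."""
        try:
            # '下一页' (next page) is the link text on the page
            next_btn = self.driver.find_element(By.LINK_TEXT, "下一页")
            # Treat a missing class attribute as an empty string
            return "disabled" not in (next_btn.get_attribute("class") or "")
        except NoSuchElementException:
            return False

    def go_next_page(self):
        """Go to the next page."""
        next_btn = self.driver.find_element(By.LINK_TEXT, "下一页")
        next_btn.click()
        time.sleep(2)

    def has_unvisited_persons(self):
        """Check whether any person is still unvisited."""
        # Button colour, changed text, etc. could also serve as the marker.
        # No '人员走访' button left means everyone has been visited.
        btns = self.driver.find_elements(By.XPATH, "//*[contains(text(), '人员走访')]")
        return len(btns) > 0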
    
    def close(self):
        self.driver.quit()

if __name__ == "__main__":
    bot = VisitAutomation()
    try:
        # Put your system's home page URL here
        bot.start("http://your-intranet-host/houseList")
    except Exception as e:
        print(f"Unexpected error: {e}")
        import traceback
        traceback.print_exc()
    finally:
        input("Press Enter to close the browser...")
        bot.close()

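VI. Troubleshooting common problems (required reading for intranet use)

1. ChromeDriver version mismatch

Error: session not created: This version of ChromeDriver only supports Chrome version XX

Fix: Chrome 49 must be paired with ChromeDriver 2.22 (download link given above)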
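
You can sanity-check the pairing from the version string shown on chrome://version. A minimal sketch: the function names are mine, and the 49 to 52 range is what I recall from the ChromeDriver 2.22 release notes, so treat it as an assumption and confirm it against the notes shipped with your download.

```python
def chrome_major(version):
    """Extract the major version from a string such as '49.0.2623.112'."""
    return int(version.strip().split(".")[0])


def driver_2_22_supports(version, low=49, high=52):
    """True if the Chrome major version falls in the range assumed for ChromeDriver 2.22."""
    return low <= chrome_major(version) <= high
```

driver_2_22_supports("49.0.2623.112") returns True; a version outside the range means you need a different driver build.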

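2. Element cannot be located

Since I cannot see the pages, you need to give me:

A screenshot of the home page house list HTML (the F12 Elements panel)
The HTML fragment of the person list on the house page
The HTML source of the visit pop-up form

Photograph it or type out the key tag structure by hand, for example: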

HTML

<div class="grid">
  <div class="item" onclick="goHouse('001')">
    <img src="..."><span>张三户</span>
  </div>
</div>
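
For a structure like this sample, the house entries could be matched by their onclick handler, and the house id pulled out of it for logging. This is a sketch against the sample only: goHouse and the class name item come from the example above, not from your real page, and the helper name is mine.

```python
import re

# XPath for the sample markup: clickable items whose onclick calls goHouse(...)
SAMPLE_HOUSE_XPATH = "//div[@class='item'][contains(@onclick, 'goHouse')]"

ONCLICK_ID = re.compile(r"goHouse\(\s*['\"]([^'\"]+)['\"]\s*\)")


def house_id_from_onclick(onclick):
    """Return the argument of goHouse('...'), or None if the pattern is absent."""
    match = ONCLICK_ID.search(onclick or "")
    return match.group(1) if match else None
```

Once you send the real fragment, both the XPath and the pattern would be adjusted to the actual handler name.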

3. Handling pop-ups (common in legacy systems)

If it is a native system dialog (not a browser alert), use pywinauto:

Python

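from pywinauto import Desktop
# Locate the "saved successfully" dialog; '提示' (notice) is the dialog's own title text
dlg = Desktop(backend="win32").window(title_re=".*提示.*")
# Assumed button caption '确定' (OK); replace with the caption on the real dialog
dlg.child_window(title="确定", class_name="Button").click()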

4. Pacing (important)

Legacy systems stall easily; add time.sleep(1.5) or an explicit wait after each step.
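
An explicit wait simply polls a condition until it holds or a deadline passes, which is usually faster and more reliable than a fixed sleep. The helper below is a generic sketch of that idea in plain Python; wait_until is my own name, and for browser conditions Selenium's WebDriverWait (already used in the skeleton as self.wait) does the same job.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or the timeout expires.

    Returns the truthy value; raises TimeoutError otherwise.
    """
    deadline = time.time() + timeout
    while True:
        try:
            result = condition()
        except Exception:
            result = None  # treat errors (element not there yet) as "not ready"
        if result:
            return result
        if time.time() >= deadline:
            raise TimeoutError("condition not met within {} s".format(timeout))
        time.sleep(interval)
```

For instance, wait_until(lambda: driver.find_elements(By.TAG_NAME, "textarea")) returns as soon as the form's text box exists instead of always sleeping the full delay.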

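VII. Next steps

Please download all the dependencies and the driver on the internet-connected machine first. After installing them on the intranet:

Test whether ChromeDriver can start: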

Python

from selenium import webdriver
driver = webdriver.Chrome()
driver.get("http://www.baidu.com")  # not reachable on an isolated intranet; substitute an internal page

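Use F12 to collect the XPath of the key page elements:
- Home page: the tag that holds the household name (is it <a>, <div>, or <img>?)
- House page: the full XPath of the "人员走访" button
- Form page: the name or id attribute of the text box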
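
One way to keep track of these three items is a small table that you fill in as you collect them. The keys below are placeholder names of mine, and None marks a value you have not gathered yet.

```python
# Fill these in from the DevTools results; None marks a value still to be collected.
LOCATORS = {
    "house_name": None,    # home page: XPath/selector of a household name element
    "visit_button": None,  # house page: full XPath of the "人员走访" button
    "remark_box": None,    # form page: name or id of the text box
}


def unfilled(locators):
    """Return the keys whose value is still missing or blank."""
    return sorted(k for k, v in locators.items() if not (v and v.strip()))
```

When unfilled(LOCATORS) returns an empty list, you have everything needed to replace the placeholder XPaths in the skeleton.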