I recently needed to read Word documents from PHP and, after searching around, settled on the third-party PHPWord library.
Install PHPWord with Composer
composer require phpoffice/phpword
If your file is in .doc format, just save it as .docx. If you have many .doc files, you can download a batch conversion tool: http://www.batchwork.com/en/doc2doc/download.htm
Key points
- Alignment: \PhpOffice\PhpWord\Style\Paragraph -> getAlignment()
- Font name: \PhpOffice\PhpWord\Style\Font -> getName()
- Font size: \PhpOffice\PhpWord\Style\Font -> getSize()
- Bold or not: \PhpOffice\PhpWord\Style\Font -> isBold()
- Reading an image: \PhpOffice\PhpWord\Element\Image -> getImageStringData()
- Saving base64 image data as an image file: file_put_contents($imageSrc, base64_decode($imageData))
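The last point can be sketched as a small helper. This is a minimal sketch, not the code used later in this post: the function name `saveBase64Image` is hypothetical, and it assumes `$imageData` is already a base64 string. With PHPWord that would most likely mean calling `getImageStringData(true)`, since as far as I know the method returns hex-encoded data unless its argument is `true`.

```php
<?php
// Hypothetical helper (not part of PHPWord): decode base64 image data
// and write the resulting bytes to the path given in $imageSrc.
function saveBase64Image(string $imageData, string $imageSrc): int
{
    // Strict mode makes base64_decode() return false on characters
    // outside the base64 alphabet instead of silently skipping them.
    $binary = base64_decode($imageData, true);
    if ($binary === false) {
        throw new InvalidArgumentException('Image data is not valid base64.');
    }

    // file_put_contents() returns the number of bytes written, or false on failure.
    $bytes = file_put_contents($imageSrc, $binary);
    if ($bytes === false) {
        throw new RuntimeException("Could not write image to {$imageSrc}.");
    }

    return $bytes;
}
```

Checking both return values avoids writing an empty or corrupt file when the data is not base64 or the target directory is not writable, which the one-liner would do silently.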
Full code
require './vendor/autoload.php';

// Convert a .docx file to an HTML string.
function docx2html($source)
{
    $phpWord = \PhpOffice\PhpWord\IOFactory::load($source);
    $html = '';
    // Walk every top-level element of every section.
    foreach ($phpWord->getSections() as $section) {
        foreach ($section->getElements() as $ele1) {
            $paragraphStyle = $ele1->getParagraphStyle();
            if ($paragraphStyle) {
                // Open a paragraph that carries over the source alignment.
                $html .= '<p style="text-align:'. $paragraphStyle->getAlignment() .';text-indent:20px;">';