
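How do I filter Microsoft Word gunk out of pasted content?

I have some users who post to a group blog and are able to cut and paste, but their pastes include content like this: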

<!– /* Font Definitions */ @font-face {font-family:”Cambria Math”; panose-1:2 4 5 3 5 4 6 3 2 4; mso-font-charset:1; mso-generic-font-family:roman; mso-font-format:other; mso-font-pitch:variable; mso-font-signature:0 0 0 0 0 0;} @font-face {font-family:Calibri; panose-1:2 15 5 2 2 2 4 3 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:-520092929 1073786111 9 0 415 0;} @font-face {font-family:”Trebuchet MS”; panose-1:2 11 6 3 2 2 2 2 2 4; mso-font-charset:0; mso-generic-font-family:swiss; mso-font-pitch:variable; mso-font-signature:647 0 0 0 159 0;} /* Style Definitions */ p.MsoNormal, li.MsoNormal, div.MsoNormal {mso-style-unhide:no; mso-style-qformat:yes; mso-style-parent:”"; margin-top:0in; margin-right:0in; margin-bottom:10.0pt; margin-left:0in; line-height:115%; mso-pagination:Widow-Orphan; font-size:12.0pt; font-family:”Trebuchet MS”,”sans-serif”; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-bidi-font-family:”Times New Roman”; mso-bidi-theme-font:minor-bidi; color:black;} p {mso-style-noshow:yes; mso-style-priority:99; mso-margin-top-alt:auto; margin-right:0in; mso-margin-bottom-alt:auto; margin-left:0in; mso-pagination:Widow-Orphan; font-size:12.0pt; font-family:”Times New Roman”,”serif”; mso-fareast-font-family:”Times New Roman”;} .MsoChpDefault {mso-style-type:export-only; mso-default-props:yes; font-size:12.0pt; mso-ansi-font-size:12.0pt; mso-bidi-font-size:12.0pt; mso-ascii-font-family:”Trebuchet MS”; mso-fareast-font-family:Calibri; mso-fareast-theme-font:minor-latin; mso-hansi-font-family:”Trebuchet MS”; mso-bidi-font-family:”Times New Roman”; mso-bidi-theme-font:minor-bidi; color:black;} .MsoPapDefault {mso-style-type:export-only; margin-bottom:10.0pt; line-height:115%;} @page WordSection1 {size:8.5in 11.0in; margin:1.0in 1.0in 1.0in 1.0in; mso-header-margin:.5in; mso-footer-margin:.5in; mso-paper-source:0;} div.WordSection1 {page:WordSection1;} –>

/* Style Definitions */
table.MsoNormalTable
{mso-style-name:”Table Normal”;
mso-tstyle-rowband-size:0;
mso-tstyle-colband-size:0;
mso-style-noshow:yes;
mso-style-priority:99;
mso-style-qformat:yes;
mso-style-parent:”";
mso-padding-alt:0in 5.4pt 0in 5.4pt;
mso-para-margin-top:0in;
mso-para-margin-right:0in;
mso-para-margin-bottom:10.0pt;
mso-para-margin-left:0in;
line-height:115%;
mso-pagination:Widow-Orphan;
font-size:11.0pt;
font-family:”Calibri”,”sans-serif”;
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:”Times New Roman”;
mso-fareast-theme-font:minor-fareast;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:”Times New Roman”;
mso-bidi-theme-font:minor-bidi;}

What can I do to filter out this kind of code automatically?

3
artlung

WordPress's built-in visual editor has a button that strips Microsoft Word formatting. It is labeled "Paste from Word".

8
Chris_O

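I'd recommend Ozh's TinyMCE Advanced plugin. It lets you add a "Paste from Word" option that does all of this for you.

If you're not interested in that, though, there are other options. For example: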

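function get_rid_of_mso_junk( $content ){
  // Removes declarations such as "mso-font-charset:0;" or "panose-1:2 4 5;".
  // The tail of this pattern was garbled in the source; the optional newline
  // group and the "i" flag are an assumed reconstruction.
  return preg_replace( '@(mso|panose)[^:]{1,25}:[^;]+;(\s+)?(\n+)?@i', '', $content );
}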

add_filter( 'content_save_pre', 'get_rid_of_mso_junk' );

To remove more lines, just keep adding unwanted declaration prefixes to the first capture group in that regex. For example: (mso|panose|other-junk|annoyance)
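Extending the alternation by hand can be sketched as building the pattern from a list of prefixes. This is only an illustration in plain PHP: the helper names build_junk_pattern and strip_junk_declarations are hypothetical, and the trailing `\s*` is a simplification of the whitespace handling in the regex above.

```php
<?php
// Hypothetical helper: builds the regex from a list of property prefixes.
function build_junk_pattern( array $prefixes ) {
    // Escape each prefix so it is matched literally; '@' is the delimiter.
    $quoted = array_map( function ( $prefix ) {
        return preg_quote( $prefix, '@' );
    }, $prefixes );

    // prefix + rest of the property name + ":" + value + ";" + trailing whitespace
    return '@(' . implode( '|', $quoted ) . ')[^:]{1,25}:[^;]+;\s*@i';
}

// Hypothetical wrapper: strips every declaration that starts with a listed prefix.
function strip_junk_declarations( $content, array $prefixes = array( 'mso', 'panose' ) ) {
    return preg_replace( build_junk_pattern( $prefixes ), '', $content );
}

$sample = 'p {mso-style-priority:99; margin-left:0in; panose-1:2 4 5; color:black;}';
echo strip_junk_declarations( $sample ), "\n";
```

With a helper like this, adding another unwanted prefix means appending one string to the array rather than editing the regex itself.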

5
John P Bloch

For anyone looking for a solution to this problem, I did something like this:

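function delete_between($beginning, $end, $string) {
    // strpos() returns 0 for a match at the very start, so compare against false.
    $beginningPos = strpos($string, $beginning);
    if ($beginningPos === false) {
        return $string;
    }

    // Look for the end marker only after the start marker.
    $endPos = strpos($string, $end, $beginningPos + strlen($beginning));
    if ($endPos === false) {
        return $string;
    }

    $textToDelete = substr($string, $beginningPos, ($endPos + strlen($end)) - $beginningPos);

    return str_replace($textToDelete, '', $string);
}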

function clean_content( $content ){
    // Only clean on the blog home page and on single posts.
    if( is_home() || is_single() ){
        $content = delete_between('<!--[if gte mso', ';}', $content);
    }
    return $content;
}

add_filter( 'the_content', 'clean_content' );
add_filter( 'the_excerpt', 'clean_content' );

You can replace the strings passed to delete_between with whatever strings you want. This seems to work for me.
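Because delete_between locates a single start/end pair per call, a post with several Word blocks needs something more. Below is a minimal sketch that strips every block in one pass, assuming the junk is wrapped in conditional comments that open with `<!--[if gte mso` and close with `<![endif]-->`. The function name strip_mso_conditional_comments is hypothetical.

```php
<?php
// Hypothetical helper: removes every "<!--[if gte mso ...]> ... <![endif]-->" block.
function strip_mso_conditional_comments( $html ) {
    // s: "." also matches newlines; i: case-insensitive.
    // The lazy ".*?" stops at the first closing marker, so separate blocks stay separate.
    $cleaned = preg_replace( '/<!--\[if\s+gte\s+mso.*?<!\[endif\]-->/si', '', $html );

    // preg_replace() returns null on failure; fall back to the untouched input.
    return null === $cleaned ? $html : $cleaned;
}

$sample = "<p>Intro</p><!--[if gte mso 9]><xml>junk</xml><![endif]--><p>Body</p>"
        . "<!--[if gte mso 10]>\n<style>table.MsoNormalTable {mso-style-noshow:yes;}</style>\n<![endif]-->";
echo strip_mso_conditional_comments( $sample ), "\n";
```

It could be hooked up with the same add_filter calls as clean_content. Pasted content that lacks the closing `<![endif]-->` marker is left unchanged.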

2
codeprokanner

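I've worked with clients who run into this problem all the time. I've found that the trick is to paste into the HTML view and then switch back to the Visual editor to adjust the formatting as needed.

This is also necessary when copying and pasting from other websites. Sometimes you'll accidentally pull in class definitions and inline styles from the external source, and if your site doesn't define or support the same classes or styles, they can break the display.

Another option is to introduce your users to Windows Live Writer. It's a completely free Microsoft product that handles copy-paste from Word well and integrates with WordPress: you can write posts, edit posts, use the built-in spell checker, format a post to look exactly the way you want, and then click "Publish" to push it to WordPress over XML-RPC. It's a fairly polished system that makes it incredibly easy to teach first-time bloggers how to blog, especially since the UI is so similar to Word's to begin with.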

2
EAMann