{"id":5352,"date":"2022-08-09T15:01:54","date_gmt":"2022-08-09T07:01:54","guid":{"rendered":"http:\/\/139.9.1.231\/?p=5352"},"modified":"2022-08-15T14:30:39","modified_gmt":"2022-08-15T06:30:39","slug":"huggingface-transformers","status":"publish","type":"post","link":"http:\/\/139.9.1.231\/index.php\/2022\/08\/09\/huggingface-transformers\/","title":{"rendered":"\ud83e\udd17 Huggingface Transformers"},"content":{"rendered":"\n<p>       Huggingface Transformers 是一个基于 transformer 模型结构的开源预训练语言模型库，支持 PyTorch、TensorFlow 2.0，并且支持两个框架间模型的相互转换。该库集成了最新的各种 NLP 预训练语言模型，使用者可以快速地调用模型，并且支持模型的继续预训练（further pretraining）和下游任务微调（fine-tuning）。\u00a0<\/p>\n\n\n\n\n\n<ul><li>paper:\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/arxiv.org\/pdf\/1910.03771.pdf\" target=\"_blank\">https:\/\/arxiv.org\/pdf\/1910.03771.pdf<\/a>\u00a0（EMNLP Best Demo 2020）<\/li><li>github:\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/huggingface\/transformers\" target=\"_blank\">https:\/\/github.com\/huggingface\/transformers<\/a><\/li><li>官方教程:\u00a0<a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/transformers\/\" target=\"_blank\">https:\/\/huggingface.co\/transformers<\/a><\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"986\" height=\"311\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-61.png\" alt=\"\" class=\"wp-image-5357\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-61.png 986w, 
http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-61-300x95.png 300w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-61-768x242.png 768w\" sizes=\"(max-width: 986px) 100vw, 986px\" \/><\/figure>\n\n\n\n<p>      该库是使用 BERT 等预训练模型最常用的库，其使用热度甚至超过了 Google 等官方的开源实现。它的设计原则保证了对各种不同预训练模型的支持，并提供统一、合理的接口规范，使用者可以很方便地下载和使用模型。同时，它支持用户将自己的预训练模型上传到 Model Hub 中，供其他用户使用。对于 NLP 从业者，可以借助这个库很方便地使用自然语言理解（NLU）和自然语言生成（NLG）任务的 SOTA 模型。<\/p>\n\n\n\n<p>特色：<\/p>\n\n\n\n<ul><li>超级&nbsp;<strong>简单<\/strong>，<strong>快速<\/strong>上手<\/li><li>适合于所有人 &#8211; NLP 研究员、NLP 应用人员、教育工作者<\/li><li>NLU\/NLG SOTA 模型支持<\/li><li>减少预训练成本，提供了 30+ 预训练模型、100+ 语言 &#8211; 支持 PyTorch 与 TensorFlow 2.0 转换。<\/li><li>以下为部分整合的预训练语言模型, ref:&nbsp;<a rel=\"noreferrer noopener\" 
href=\"https:\/\/github.com\/huggingface\/transformers\" target=\"_blank\">Transformers Github<\/a>：<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"620\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-62-1024x620.png\" alt=\"\" class=\"wp-image-5362\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-62-1024x620.png 1024w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-62-300x182.png 300w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-62-768x465.png 768w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-62.png 1246w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" width=\"1024\" height=\"875\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-63-1024x875.png\" alt=\"\" class=\"wp-image-5364\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-63-1024x875.png 1024w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-63-300x256.png 300w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-63-768x656.png 768w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-63.png 1161w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>\ud83e\udd17&nbsp;Transformers 提供了数以千计的预训练模型，支持 100 多种语言的文本分类、信息抽取、问答、摘要、翻译、文本生成。它的宗旨是让最先进的 NLP 技术人人易用。<\/p>\n\n\n\n<p>\ud83e\udd17&nbsp;Transformers 
\u63d0\u4f9b\u4e86\u4fbf\u4e8e\u5feb\u901f\u4e0b\u8f7d\u548c\u4f7f\u7528\u7684API\uff0c\u8ba9\u4f60\u53ef\u4ee5\u628a\u9884\u8bad\u7ec3\u6a21\u578b\u7528\u5728\u7ed9\u5b9a\u6587\u672c\u3001\u5728\u4f60\u7684\u6570\u636e\u96c6\u4e0a\u5fae\u8c03\u7136\u540e\u901a\u8fc7&nbsp;<a href=\"https:\/\/huggingface.co\/models\">model hub<\/a>&nbsp;\u4e0e\u793e\u533a\u5171\u4eab\u3002\u540c\u65f6\uff0c\u6bcf\u4e2a\u5b9a\u4e49\u7684 Python \u6a21\u5757\u5747\u5b8c\u5168\u72ec\u7acb\uff0c\u65b9\u4fbf\u4fee\u6539\u548c\u5feb\u901f\u7814\u7a76\u5b9e\u9a8c\u3002<\/p>\n\n\n\n<p>\ud83e\udd17&nbsp;Transformers \u652f\u6301\u4e09\u4e2a\u6700\u70ed\u95e8\u7684\u6df1\u5ea6\u5b66\u4e60\u5e93\uff1a&nbsp;<a href=\"https:\/\/jax.readthedocs.io\/en\/latest\/\">Jax<\/a>,&nbsp;<a href=\"https:\/\/pytorch.org\/\">PyTorch<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/www.tensorflow.org\/\">TensorFlow<\/a>&nbsp;\u2014 \u5e76\u4e0e\u4e4b\u65e0\u7f1d\u6574\u5408\u3002\u4f60\u53ef\u4ee5\u76f4\u63a5\u4f7f\u7528\u4e00\u4e2a\u6846\u67b6\u8bad\u7ec3\u4f60\u7684\u6a21\u578b\u7136\u540e\u7528\u53e6\u4e00\u4e2a\u52a0\u8f7d\u548c\u63a8\u7406\u3002<\/p>\n\n\n\n<h2>\u5728\u7ebf\u6f14\u793a<\/h2>\n\n\n\n<p>\u4f60\u53ef\u4ee5\u76f4\u63a5\u5728\u6a21\u578b\u9875\u9762\u4e0a\u6d4b\u8bd5\u5927\u591a\u6570&nbsp;<a href=\"https:\/\/huggingface.co\/models\">model hub<\/a>&nbsp;\u4e0a\u7684\u6a21\u578b\u3002 \u6211\u4eec\u4e5f\u63d0\u4f9b\u4e86&nbsp;<a href=\"https:\/\/huggingface.co\/pricing\">\u79c1\u6709\u6a21\u578b\u6258\u7ba1\u3001\u6a21\u578b\u7248\u672c\u7ba1\u7406\u4ee5\u53ca\u63a8\u7406API<\/a>\u3002<\/p>\n\n\n\n<p>\u8fd9\u91cc\u662f\u4e00\u4e9b\u4f8b\u5b50\uff1a<\/p>\n\n\n\n<ul><li><a href=\"https:\/\/huggingface.co\/bert-base-uncased?text=Paris+is+the+%5BMASK%5D+of+France\">\u7528 BERT \u505a\u63a9\u7801\u586b\u8bcd<\/a><\/li><li><a href=\"https:\/\/huggingface.co\/dbmdz\/electra-large-discriminator-finetuned-conll03-english?text=My+name+is+Sarah+and+I+live+in+London+city\">\u7528 Electra 
\u505a\u547d\u540d\u5b9e\u4f53\u8bc6\u522b<\/a><\/li><li><a href=\"https:\/\/huggingface.co\/gpt2?text=A+long+time+ago%2C+\">\u7528 GPT-2 \u505a\u6587\u672c\u751f\u6210<\/a><\/li><li><a href=\"https:\/\/huggingface.co\/roberta-large-mnli?text=The+dog+was+lost.+Nobody+lost+any+animal\">\u7528 RoBERTa \u505a\u81ea\u7136\u8bed\u8a00\u63a8\u7406<\/a><\/li><li><a href=\"https:\/\/huggingface.co\/facebook\/bart-large-cnn?text=The+tower+is+324+metres+%281%2C063+ft%29+tall%2C+about+the+same+height+as+an+81-storey+building%2C+and+the+tallest+structure+in+Paris.+Its+base+is+square%2C+measuring+125+metres+%28410+ft%29+on+each+side.+During+its+construction%2C+the+Eiffel+Tower+surpassed+the+Washington+Monument+to+become+the+tallest+man-made+structure+in+the+world%2C+a+title+it+held+for+41+years+until+the+Chrysler+Building+in+New+York+City+was+finished+in+1930.+It+was+the+first+structure+to+reach+a+height+of+300+metres.+Due+to+the+addition+of+a+broadcasting+aerial+at+the+top+of+the+tower+in+1957%2C+it+is+now+taller+than+the+Chrysler+Building+by+5.2+metres+%2817+ft%29.+Excluding+transmitters%2C+the+Eiffel+Tower+is+the+second+tallest+free-standing+structure+in+France+after+the+Millau+Viaduct\">\u7528 BART \u505a\u6587\u672c\u6458\u8981<\/a><\/li><li><a 
href=\"https:\/\/huggingface.co\/distilbert-base-uncased-distilled-squad?text=Which+name+is+also+used+to+describe+the+Amazon+rainforest+in+English%3F&amp;context=The+Amazon+rainforest+%28Portuguese%3A+Floresta+Amaz%C3%B4nica+or+Amaz%C3%B4nia%3B+Spanish%3A+Selva+Amaz%C3%B3nica%2C+Amazon%C3%ADa+or+usually+Amazonia%3B+French%3A+For%C3%AAt+amazonienne%3B+Dutch%3A+Amazoneregenwoud%29%2C+also+known+in+English+as+Amazonia+or+the+Amazon+Jungle%2C+is+a+moist+broadleaf+forest+that+covers+most+of+the+Amazon+basin+of+South+America.+This+basin+encompasses+7%2C000%2C000+square+kilometres+%282%2C700%2C000+sq+mi%29%2C+of+which+5%2C500%2C000+square+kilometres+%282%2C100%2C000+sq+mi%29+are+covered+by+the+rainforest.+This+region+includes+territory+belonging+to+nine+nations.+The+majority+of+the+forest+is+contained+within+Brazil%2C+with+60%25+of+the+rainforest%2C+followed+by+Peru+with+13%25%2C+Colombia+with+10%25%2C+and+with+minor+amounts+in+Venezuela%2C+Ecuador%2C+Bolivia%2C+Guyana%2C+Suriname+and+French+Guiana.+States+or+departments+in+four+nations+contain+%22Amazonas%22+in+their+names.+The+Amazon+represents+over+half+of+the+planet%27s+remaining+rainforests%2C+and+comprises+the+largest+and+most+biodiverse+tract+of+tropical+rainforest+in+the+world%2C+with+an+estimated+390+billion+individual+trees+divided+into+16%2C000+species\">\u7528 DistilBERT \u505a\u95ee\u7b54<\/a><\/li><li><a href=\"https:\/\/huggingface.co\/t5-base?text=My+name+is+Wolfgang+and+I+live+in+Berlin\">\u7528 T5 
做翻译<\/a><\/li><\/ul>\n\n\n\n<h2>快速上手<\/h2>\n\n\n\n<p>我们为快速使用模型提供了&nbsp;<code>pipeline<\/code>&nbsp;（流水线）API。流水线聚合了预训练模型和对应的文本预处理。下面是一个快速使用流水线去判断正负面情绪的例子：<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&gt;&gt;&gt; from transformers import pipeline\n\n# 使用情绪分析流水线\n&gt;&gt;&gt; classifier = pipeline('sentiment-analysis')\n&gt;&gt;&gt; classifier('We are very happy to introduce pipeline to the transformers repository.')\n[{'label': 'POSITIVE', 'score': 0.9996980428695679}]<\/pre>\n\n\n\n<p>第二行代码下载并缓存了流水线使用的预训练模型，而第三行代码则在给定的文本上进行了评估。这里的答案“正面” (positive) 具有 99.97% 的置信度。<\/p>\n\n\n\n<p>许多的 NLP 任务都有开箱即用的预训练流水线。比如说，我们可以轻松地从给定文本中抽取问题答案：<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&gt;&gt;&gt; from transformers import pipeline\n\n# 使用问答流水线\n&gt;&gt;&gt; question_answerer = pipeline('question-answering')\n&gt;&gt;&gt; question_answerer({\n...     'question': 'What is the name of the repository ?',\n...     'context': 'Pipeline has been included in the huggingface\/transformers repository'\n... 
})\n{'score': 0.30970096588134766, 'start': 34, 'end': 58, 'answer': 'huggingface\/transformers'}<\/pre>\n\n\n\n<p>\u9664\u4e86\u7ed9\u51fa\u7b54\u6848\uff0c\u9884\u8bad\u7ec3\u6a21\u578b\u8fd8\u7ed9\u51fa\u4e86\u5bf9\u5e94\u7684\u7f6e\u4fe1\u5ea6\u5206\u6570\u3001\u7b54\u6848\u5728\u8bcd\u7b26\u5316 (tokenized) \u540e\u7684\u6587\u672c\u4e2d\u5f00\u59cb\u548c\u7ed3\u675f\u7684\u4f4d\u7f6e\u3002\u4f60\u53ef\u4ee5\u4ece<a href=\"https:\/\/huggingface.co\/docs\/transformers\/task_summary\">\u8fd9\u4e2a\u6559\u7a0b<\/a>\u4e86\u89e3\u66f4\u591a\u6d41\u6c34\u7ebfAPI\u652f\u6301\u7684\u4efb\u52a1\u3002<\/p>\n\n\n\n<p>\u8981\u5728\u4f60\u7684\u4efb\u52a1\u4e0a\u4e0b\u8f7d\u548c\u4f7f\u7528\u4efb\u610f\u9884\u8bad\u7ec3\u6a21\u578b\u4e5f\u5f88\u7b80\u5355\uff0c\u53ea\u9700\u4e09\u884c\u4ee3\u7801\u3002\u8fd9\u91cc\u662f PyTorch \u7248\u7684\u793a\u4f8b\uff1a<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&gt;&gt;&gt; from transformers import AutoTokenizer, AutoModel\n\n&gt;&gt;&gt; tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n&gt;&gt;&gt; model = AutoModel.from_pretrained(\"bert-base-uncased\")\n\n&gt;&gt;&gt; inputs = tokenizer(\"Hello world!\", return_tensors=\"pt\")\n&gt;&gt;&gt; outputs = model(**inputs)<\/pre>\n\n\n\n<p>\u8fd9\u91cc\u662f\u7b49\u6548\u7684 TensorFlow \u4ee3\u7801\uff1a<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">&gt;&gt;&gt; from transformers import AutoTokenizer, TFAutoModel\n\n&gt;&gt;&gt; tokenizer = AutoTokenizer.from_pretrained(\"bert-base-uncased\")\n&gt;&gt;&gt; model = TFAutoModel.from_pretrained(\"bert-base-uncased\")\n\n&gt;&gt;&gt; inputs = tokenizer(\"Hello world!\", return_tensors=\"tf\")\n&gt;&gt;&gt; outputs = model(**inputs)<\/pre>\n\n\n\n<p>\u8bcd\u7b26\u5316\u5668 (tokenizer) 
为所有的预训练模型提供了预处理，并可以直接对单个字符串进行调用（比如上面的例子）或对列表 (list) 调用。它会输出一个你可以在下游代码里使用或直接通过&nbsp;<code>**<\/code>&nbsp;解包表达式传给模型的词典 (dict)。<\/p>\n\n\n\n<p>模型本身是一个常规的&nbsp;<a href=\"https:\/\/pytorch.org\/docs\/stable\/nn.html#torch.nn.Module\">PyTorch&nbsp;<code>nn.Module<\/code><\/a>&nbsp;或&nbsp;<a href=\"https:\/\/www.tensorflow.org\/api_docs\/python\/tf\/keras\/Model\">TensorFlow&nbsp;<code>tf.keras.Model<\/code><\/a>（取决于你的后端），可以以常规方式使用。&nbsp;<a href=\"https:\/\/huggingface.co\/transformers\/training.html\">这个教程<\/a>解释了如何将这样的模型整合到经典的 PyTorch 或 TensorFlow 训练循环中，或是如何使用我们的&nbsp;<code>Trainer<\/code>&nbsp;训练器 API 来在一个新的数据集上快速微调。<\/p>\n\n\n\n<h2>为什么要用 transformers？<\/h2>\n\n\n\n<ol><li>便于使用的先进模型：<ul><li>NLU 和 NLG 
上表现优越<\/li><li>对教学和实践友好且低门槛<\/li><li>高级抽象，只需了解三个类<\/li><li>对所有模型统一的 API<\/li><\/ul><\/li><li>更低计算开销，更少的碳排放：<ul><li>研究人员可以分享已训练的模型而非每次从头开始训练<\/li><li>工程师可以减少计算用时和生产环境开销<\/li><li>数十种模型架构、两千多个预训练模型、100 多种语言支持<\/li><\/ul><\/li><li>对于模型生命周期的每一个部分都面面俱到：<ul><li>训练先进的模型，只需 3 行代码<\/li><li>模型在不同深度学习框架间任意转移，随你心意<\/li><li>为训练、评估和生产选择最适合的框架，衔接无缝<\/li><\/ul><\/li><li>为你的需求轻松定制专属模型和用例：<ul><li>我们为每种模型架构提供了多个用例来复现原论文结果<\/li><li>模型内部结构保持透明一致<\/li><li>模型文件可单独使用，方便魔改和快速实验<\/li><\/ul><\/li><\/ol>\n\n\n\n<h2>什么情况下我不该用 
transformers\uff1f<\/h2>\n\n\n\n<ul><li>\u672c\u5e93\u5e76\u4e0d\u662f\u6a21\u5757\u5316\u7684\u795e\u7ecf\u7f51\u7edc\u5de5\u5177\u7bb1\u3002\u6a21\u578b\u6587\u4ef6\u4e2d\u7684\u4ee3\u7801\u7279\u610f\u5448\u82e5\u749e\u7389\uff0c\u672a\u7ecf\u989d\u5916\u62bd\u8c61\u5c01\u88c5\uff0c\u4ee5\u4fbf\u7814\u7a76\u4eba\u5458\u5feb\u901f\u8fed\u4ee3\u9b54\u6539\u800c\u4e0d\u81f4\u6eba\u4e8e\u62bd\u8c61\u548c\u6587\u4ef6\u8df3\u8f6c\u4e4b\u4e2d\u3002<\/li><li><code>Trainer<\/code>&nbsp;API \u5e76\u975e\u517c\u5bb9\u4efb\u4f55\u6a21\u578b\uff0c\u53ea\u4e3a\u672c\u5e93\u4e4b\u6a21\u578b\u4f18\u5316\u3002\u82e5\u662f\u5728\u5bfb\u627e\u9002\u7528\u4e8e\u901a\u7528\u673a\u5668\u5b66\u4e60\u7684\u8bad\u7ec3\u5faa\u73af\u5b9e\u73b0\uff0c\u8bf7\u53e6\u89c5\u4ed6\u5e93\u3002<\/li><li>\u5c3d\u7ba1\u6211\u4eec\u5df2\u5c3d\u529b\u800c\u4e3a\uff0c<a href=\"https:\/\/github.com\/huggingface\/transformers\/tree\/main\/examples\">examples \u76ee\u5f55<\/a>\u4e2d\u7684\u811a\u672c\u4e5f\u4ec5\u4e3a\u7528\u4f8b\u800c\u5df2\u3002\u5bf9\u4e8e\u4f60\u7684\u7279\u5b9a\u95ee\u9898\uff0c\u5b83\u4eec\u5e76\u4e0d\u4e00\u5b9a\u5f00\u7bb1\u5373\u7528\uff0c\u53ef\u80fd\u9700\u8981\u6539\u51e0\u884c\u4ee3\u7801\u4ee5\u9002\u4e4b\u3002<\/li><\/ul>\n\n\n\n<h2>\u4e86\u89e3\u66f4\u591a<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>\u7ae0\u8282<\/th><th>\u63cf\u8ff0<\/th><\/tr><\/thead><tbody><tr><td><a href=\"https:\/\/huggingface.co\/transformers\/\">\u6587\u6863<\/a><\/td><td>\u5b8c\u6574\u7684 API \u6587\u6863\u548c\u6559\u7a0b<\/td><\/tr><tr><td><a href=\"https:\/\/huggingface.co\/docs\/transformers\/task_summary\">\u4efb\u52a1\u603b\u7ed3<\/a><\/td><td>\ud83e\udd17&nbsp;Transformers \u652f\u6301\u7684\u4efb\u52a1<\/td><\/tr><tr><td><a href=\"https:\/\/huggingface.co\/docs\/transformers\/preprocessing\">\u9884\u5904\u7406\u6559\u7a0b<\/a><\/td><td>\u4f7f\u7528&nbsp;<code>Tokenizer<\/code>&nbsp;\u6765\u4e3a\u6a21\u578b\u51c6\u5907\u6570\u636e<\/td><\/tr><tr><td><a 
href=\"https:\/\/huggingface.co\/docs\/transformers\/training\">训练和微调<\/a><\/td><td>在 PyTorch\/TensorFlow 的训练循环或&nbsp;<code>Trainer<\/code>&nbsp;API 中使用&nbsp;\ud83e\udd17&nbsp;Transformers 提供的模型<\/td><\/tr><tr><td><a href=\"https:\/\/github.com\/huggingface\/transformers\/tree\/main\/examples\">快速上手：微调和用例脚本<\/a><\/td><td>为各种任务提供的用例脚本<\/td><\/tr><tr><td><a href=\"https:\/\/huggingface.co\/docs\/transformers\/model_sharing\">模型分享和上传<\/a><\/td><td>向社区上传和分享你微调的模型<\/td><\/tr><tr><td><a href=\"https:\/\/huggingface.co\/docs\/transformers\/migration\">迁移<\/a><\/td><td>从&nbsp;<code>pytorch-transformers<\/code>&nbsp;或&nbsp;<code>pytorch-pretrained-bert<\/code>&nbsp;迁移到&nbsp;\ud83e\udd17&nbsp;Transformers<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h2>Transformers model hub<\/h2>\n\n\n\n<p><a href=\"https:\/\/huggingface.co\/models\" target=\"_blank\" rel=\"noreferrer noopener\">Transformers model hub<\/a>&nbsp;提供了不同的预训练语言模型，包含了常见的 RoBERTa\/BERT\/XLNet 以及 BART 等，几乎所有的最新模型都可以在上面找到。用户可以很方便地对模型进行调用，只需要一个模型的名字，就可以获取模型文件。<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>model = AutoModel.from_pretrained(model_name)<\/code><\/pre>\n\n\n\n<h2>设计原则 Design Principles<\/h2>\n\n\n\n<p>Transformers 
的设计是为了：<\/p>\n\n\n\n<ul><li>研究者可以进行拓展<\/li><li>每个模型对应单个 modeling 文件，直接在一个文件中就可以修改模型所需要的所有部分，模块设计最小化。<\/li><li>算法工程师可以轻松使用 &#8211; 可以使用 pipeline 直接调用，获取开箱即用的任务体验，例如情感分析的任务等；也可以使用 trainers 进行训练，支持 fp16、分布式等<\/li><li>工业实践中可以快速部署且鲁棒性良好<\/li><li>CPU\/GPU\/TPU 支持，可以进行优化，支持 torchscript 静态图，支持 ONNX 格式<\/li><\/ul>\n\n\n\n<h2>库设计 Library Design<\/h2>\n\n\n\n<p>transformers 库包含了机器学习相关的主要三个部分：数据处理 process data、模型应用 apply a model 和做出预测 make predictions，分别对应如下三个模块：Tokenizer、Transformers 以及 Head。<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" width=\"725\" height=\"804\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-64.png\" alt=\"\" class=\"wp-image-5376\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-64.png 725w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-64-271x300.png 271w\" sizes=\"(max-width: 725px) 100vw, 725px\" 
\/><\/figure><\/div>\n\n\n\n<ul><li><strong>Tokenizers<\/strong>&nbsp;分词器，支持不同的分词方式。主要作用是将输入分词化后，转化为相应模型所需要的输入编码。<\/li><\/ul>\n\n\n\n<p>Tokenizer 类支持从预训练模型中进行加载，或者直接手动配置。这些类存储了 token 到 id 的字典，并且可以对输入进行分词和 decode。huggingface transformers 已经提供了如下图的相关 tokenizer 分词器。用户也可以很轻松地对 tokenizer 里的特殊字符进行更换，例如 CLS\/SEP，或者是对 Tokenizer 的字典大小进行修改等。<\/p>\n\n\n\n<p>Tokenizer 提供了很多有用的方法，例如 padding、truncating，用户可以很方便地对其进行使用。<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" width=\"634\" height=\"542\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-65.png\" alt=\"\" class=\"wp-image-5379\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-65.png 634w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-65-300x256.png 300w\" sizes=\"(max-width: 634px) 100vw, 634px\" \/><\/figure><\/div>\n\n\n\n<p><strong>Transformer<\/strong>&nbsp;transformers 
指的是各种基于 transformer 结构的预训练语言模型，例如 BERT、GPT 等。它将输入的稀疏 (sparse) 序列，转化为上下文感知的 contextual embedding。<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" width=\"586\" height=\"1024\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-67-586x1024.png\" alt=\"\" class=\"wp-image-5384\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-67-586x1024.png 586w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-67-172x300.png 172w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-67.png 616w\" sizes=\"(max-width: 586px) 100vw, 586px\" \/><\/figure><\/div>\n\n\n\n<p>encoder 模型的计算图通常就是对模型输入进行一系列的 self-attention 操作，然后得到最后的 encoder 输出。通常情况下，每个模型都是在一个文件中被完整定义的，这样方便用户进行更改和拓展。<\/p>\n\n\n\n<p>针对不同的模型结构，都采用相同的 API，这使得用户可以快速地切换使用其他不同的模型。transformers 提供一系列的 Auto classes，使得快速进行模型切换非常方便。<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>model = AutoModel.from_pretrained(model_name)<\/code><\/pre>\n\n\n\n<ul><li><strong>Head<\/strong>&nbsp;不同于 attention 的 head，这边的 head 
指的是下游任务的输出层，它将模型的 contextual embedding 转化为特定任务的预测值，包含如下不同的 head：<ul><li><strong><em>Pretraining Head<\/em><\/strong><ul><li>Causal Language Modeling（普通自回归的语言模型）：GPT、GPT-2、CTRL<\/li><li>Masked Language Modeling（掩码语言模型）：BERT、RoBERTa<\/li><li>Permuted Language Modeling（乱序重排语言模型）：XLNet<\/li><\/ul><\/li><li><strong><em>Fine-tuning Head<\/em><\/strong><ul><li>Language Modeling：语言模型训练，预测下一个词。主要用于文本生成<\/li><li>Sequence Classification：文本分类任务、情感分析任务<\/li><li>Question Answering：机器阅读理解任务，QA<\/li><li>Token Classification：token 级别的分类，主要用于命名实体识别（NER）任务、句法解析 Tagging 任务<\/li><li>Multiple Choice：多选任务，主要是文本选择任务<\/li><li>Masked LM：掩码预测，随机 mask 一个 token，预测该 token 是什么词，用于预训练<\/li><li>Conditional 
Generation：条件生成任务，主要用于翻译以及摘要任务。<\/li><\/ul><\/li><\/ul>\n\n\n\n<p>这些模型的 head，是在基础模型之上包装的另外一个类（与模型定义在同一个文件中），它提供了额外的输出层、loss 函数等。这些层的命名规范也很一致，采用的是：XXXForSequenceClassification<\/p>\n\n\n\n<p>其中 XXX 是模型的名称（如 Bert），For 之后的后缀对应下游微调 (fine-tuning) 或者预训练 pretraining 任务。一些 head，例如条件生成（conditional generation），支持额外的功能，像是 sampling 和 beam search。<\/p>\n\n\n\n<p>下图解释了每个 head 的输入和输出以及数据集。<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" width=\"761\" height=\"292\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-66.png\" alt=\"\" class=\"wp-image-5380\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-66.png 761w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-66-300x115.png 300w\" sizes=\"(max-width: 761px) 100vw, 761px\" \/><\/figure>\n\n\n\n<p>下面的代码展示了如何使用 transformers 进行下游的文本分类任务：<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import AutoModelForSequenceClassification\n\nmodel = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)<\/code><\/pre>\n\n\n\n<h1>Huggingface Transformer 
使用方法（教程）<\/h1>\n\n\n\n<p><strong>Transformers<\/strong> 提供了数以千计的针对各种任务的预训练模型，开发者可以根据自身的需要，选择模型进行训练或微调，也可阅读 API 文档和源码，快速开发新模型。<\/p>\n\n\n\n<h2 id=\"h_448852278_0\">0、<a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/course\/chapter0\/1?fw=pt\" target=\"_blank\">Setup<\/a><\/h2>\n\n\n\n<p>1）安装一个非常轻量级的 Transformers<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>!pip install transformers<\/code><\/pre>\n\n\n\n<p>然后<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import transformers<\/code><\/pre>\n\n\n\n<p>2）建议安装开发版本，几乎带有所有用例需要的依赖项<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>!pip install transformers&#091;sentencepiece]<\/code><\/pre>\n\n\n\n<h2 id=\"h_448852278_1\">一、模型简介&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter1\/1?fw=pt\" target=\"_blank\" rel=\"noreferrer noopener\">Transformer models<\/a><\/h2>\n\n\n\n<h3 id=\"h_448852278_2\">1. 
pipelines: a simple example<\/h3>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>The most basic object in the Transformers library is the <code>pipeline()<\/code> function. It connects a model with its necessary preprocessing and postprocessing steps, so we can <strong>feed it any text directly and get an answer<\/strong>:<\/p><\/blockquote>\n\n\n\n<p>On the first run it downloads the pretrained model and tokenizer and caches them.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import pipeline\n\nclassifier = pipeline(\"sentiment-analysis\")  <em># sentiment analysis<\/em>\nclassifier(\"I've been waiting for a HuggingFace course my whole life.\")\n\n<em># output<\/em>\n<em># &#091;{'label': 'POSITIVE', 'score': 0.9598047137260437}]<\/em><\/code><\/pre>\n\n\n\n<p>You can also pass several sentences:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>classifier(\n    &#091;\"I've been waiting for a HuggingFace course my whole life.\", \"I hate this so much!\"]\n)\n\n<em># output<\/em>\n'''\n&#091;{'label': 'POSITIVE', 'score': 0.9598047137260437},\n {'label': 'NEGATIVE', 'score': 0.9994558095932007}]\n'''<\/code><\/pre>\n\n\n\n<p>Some of the pipelines currently <a href=\"https:\/\/huggingface.co\/transformers\/main_classes\/pipelines.html\" target=\"_blank\" rel=\"noreferrer noopener\">available<\/a> are:<\/p>\n\n\n\n<blockquote 
class=\"wp-block-quote\"><p><code>feature-extraction<\/code>&nbsp;\u7279\u5f81\u63d0\u53d6\uff1a\u628a\u4e00\u6bb5\u6587\u5b57\u7528\u4e00\u4e2a\u5411\u91cf\u6765\u8868\u793a<br><code>fill-mask<\/code>&nbsp;\u586b\u8bcd\uff1a\u628a\u4e00\u6bb5\u6587\u5b57\u7684\u67d0\u4e9b\u90e8\u5206mask\u4f4f\uff0c\u7136\u540e\u8ba9\u6a21\u578b\u586b\u7a7a<br><code>ner<\/code>&nbsp;\u547d\u540d\u5b9e\u4f53\u8bc6\u522b\uff1a\u8bc6\u522b\u6587\u5b57\u4e2d\u51fa\u73b0\u7684\u4eba\u540d\u5730\u540d\u7684\u547d\u540d\u5b9e\u4f53<br><code>question-answering<\/code>&nbsp;\u95ee\u7b54\uff1a\u7ed9\u5b9a\u4e00\u6bb5\u6587\u672c\u4ee5\u53ca\u9488\u5bf9\u5b83\u7684\u4e00\u4e2a\u95ee\u9898\uff0c\u4ece\u6587\u672c\u4e2d\u62bd\u53d6\u7b54\u6848<br><code>sentiment-analysis<\/code>&nbsp;\u60c5\u611f\u5206\u6790\uff1a\u4e00\u6bb5\u6587\u672c\u662f\u6b63\u9762\u8fd8\u662f\u8d1f\u9762\u7684\u60c5\u611f\u503e\u5411<br><code>summarization<\/code>&nbsp;\u6458\u8981\uff1a\u6839\u636e\u4e00\u6bb5\u957f\u6587\u672c\u4e2d\u751f\u6210\u7b80\u77ed\u7684\u6458\u8981<br><code>text-generation<\/code>\u6587\u672c\u751f\u6210\uff1a\u7ed9\u5b9a\u4e00\u6bb5\u6587\u672c\uff0c\u8ba9\u6a21\u578b\u8865\u5145\u540e\u9762\u7684\u5185\u5bb9<br><code>translation<\/code>&nbsp;\u7ffb\u8bd1\uff1a\u628a\u4e00\u79cd\u8bed\u8a00\u7684\u6587\u5b57\u7ffb\u8bd1\u6210\u53e6\u4e00\u79cd\u8bed\u8a00<br><code>zero-shot-classification<\/code><\/p><\/blockquote>\n\n\n\n<p>\u8fd9\u4e9bpipeline\u7684\u5177\u4f53\u4f8b\u5b50\u53ef\u89c1\uff1a<a rel=\"noreferrer noopener\" href=\"https:\/\/huggingface.co\/course\/chapter1\/3?fw=pt\" target=\"_blank\">Transformer models &#8211; Hugging Face Course<\/a><\/p>\n\n\n\n<h2>2. 
Representative models for each task<\/h2>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" width=\"947\" height=\"731\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-212.png\" alt=\"\" class=\"wp-image-5853\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-212.png 947w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-212-300x232.png 300w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/image-212-768x593.png 768w\" sizes=\"(max-width: 947px) 100vw, 947px\" \/><\/figure><\/div>\n\n\n\n<h2 id=\"h_448852278_4\">2. Usage&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter2\/1?fw=pt\" target=\"_blank\" rel=\"noreferrer noopener\">Using Transformers<\/a><\/h2>\n\n\n\n<h3 id=\"h_448852278_5\">1. Behind the pipeline<\/h3>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img loading=\"lazy\" width=\"1024\" height=\"326\" src=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a-1024x326.png\" alt=\"\" class=\"wp-image-5855\" srcset=\"http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a-1024x326.png 1024w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a-300x95.png 300w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a-768x244.png 768w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a-1536x488.png 1536w, http:\/\/139.9.1.231\/wp-content\/uploads\/2022\/08\/e7a9569f-561f-4917-b564-f68533416e3a.png 1783w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><figcaption>Pipeline 
internals<\/figcaption><\/figure><\/div>\n\n\n\n<p>After receiving text, the pipeline usually runs three steps: Tokenizer, Model, and Post-Processing.<\/p>\n\n\n\n<p><strong>1) Tokenizer<\/strong><\/p>\n\n\n\n<p>Like other neural networks, Transformer models cannot process raw text directly, so a tokenizer handles preprocessing. Use the <code>AutoTokenizer<\/code> class and its <code>from_pretrained()<\/code> method.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import AutoTokenizer\n\ncheckpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"\ntokenizer = AutoTokenizer.from_pretrained(checkpoint)<\/code><\/pre>\n\n\n\n<p>To specify the type of tensors we want back (PyTorch, TensorFlow, or plain NumPy), we use the <code>return_tensors<\/code> argument:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>raw_inputs = &#091;\n    \"I've been waiting for a HuggingFace course my whole life.\",\n    \"I hate this so much!\",\n]\ninputs = tokenizer(raw_inputs, padding=True, truncation=True, return_tensors=\"pt\")\nprint(inputs)<\/code><\/pre>\n\n\n\n<p>The result with PyTorch tensors:<\/p>\n\n\n\n<p>The output itself is a dictionary with two keys, <code>input_ids<\/code> and <code>attention_mask<\/code>.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n    'input_ids': tensor(&#091;\n        &#091;  101,  1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172, 2607,  2026,  2878,  2166,  1012,   102],\n        &#091;  101,  1045,  5223,  2023,  2061,  2172,   999,   102,     0,     0,     0,     0,     0,     0,     0,     0]\n    ]), \n    'attention_mask': tensor(&#091;\n        &#091;1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
1, 1, 1],\n        &#091;1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]\n    ])\n}<\/code><\/pre>\n\n\n\n<p><strong>2) Model<\/strong><\/p>\n\n\n\n<p>Transformers provides an <code>AutoModel<\/code> class, which also has a <code>from_pretrained()<\/code> method:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import AutoModel\n\ncheckpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"\nmodel = AutoModel.from_pretrained(checkpoint)<\/code><\/pre>\n\n\n\n<p>If we feed our preprocessed inputs to the model, we can see:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>outputs = model(**inputs)\nprint(outputs.last_hidden_state.shape)\n\n<em># output<\/em>\n<em># torch.Size(&#091;2, 16, 768])<\/em><\/code><\/pre>\n\n\n\n<figure class=\"wp-block-image\"><img src=\"https:\/\/pic3.zhimg.com\/v2-f51f9dae359ec191229b35028d0897ca_r.jpg\" alt=\"preview\"\/><\/figure>\n\n\n\n<p>Many different architectures are available in Transformers, each designed around a specific task. A partial list:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p><code>*Model<\/code>&nbsp;(retrieve the hidden states)<br><code>*ForCausalLM<\/code><br><code>*ForMaskedLM<\/code><br><code>*ForMultipleChoice<\/code><br><code>*ForQuestionAnswering<\/code><br><code>*ForSequenceClassification<\/code><br><code>*ForTokenClassification<\/code><br>and others<\/p><\/blockquote>\n\n\n\n<p><strong>3) Post-Processing<\/strong><\/p>\n\n\n\n<p>The model's last layer outputs raw, <strong>unnormalized scores<\/strong>. To convert them into probabilities, they need to pass through a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Softmax_function\" 
target=\"_blank\" rel=\"noreferrer noopener\">SoftMax<\/a>\u5c42\uff08\u6240\u6709 Transformers \u6a21\u578b\u90fd\u8f93\u51fa logits\uff0c\u56e0\u4e3a\u7528\u4e8e\u8bad\u7ec3\u7684\u635f\u8017\u51fd\u6570\u4e00\u822c\u4f1a\u5c06\u6700\u540e\u7684\u6fc0\u6d3b\u51fd\u6570(\u5982SoftMax)\u4e0e\u5b9e\u9645\u635f\u8017\u51fd\u6570(\u5982\u4ea4\u53c9\u71b5)\u878d\u5408 \u3002<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import torch\n\npredictions = torch.nn.functional.softmax(outputs.logits, dim=-1)\nprint(predictions)<\/code><\/pre>\n\n\n\n<h3 id=\"h_448852278_6\">2.&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter2\/3?fw=pt\" target=\"_blank\" rel=\"noreferrer noopener\">Models<\/a><\/h3>\n\n\n\n<p><strong><u>1\uff09\u521b\u5efa<\/u>Transformer<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import BertConfig, BertModel\n\n<em># Building the config<\/em>\nconfig = BertConfig()\n\n<em># Building the model from the config<\/em>\nmodel = BertModel(config)<\/code><\/pre>\n\n\n\n<p><strong>2\uff09\u4e0d\u540c\u7684\u52a0\u8f7d\u65b9\u5f0f<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import BertModel\n\nmodel = BertModel.from_pretrained(\"bert-base-cased\")<\/code><\/pre>\n\n\n\n<p><strong>3\uff09\u4fdd\u5b58\u6a21\u578b<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>model.save_pretrained(\"directory_on_my_computer\")<\/code><\/pre>\n\n\n\n<p><strong>4\uff09\u4f7f\u7528Transformer model<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sequences = &#091;\"Hello!\", \"Cool.\", \"Nice!\"]\nencoded_sequences = &#091;\n    &#091;101, 7592, 999, 102],\n    &#091;101, 4658, 1012, 102],\n    &#091;101, 3835, 999, 102],\n]\n\nimport torch\n\nmodel_inputs = torch.tensor(encoded_sequences)<\/code><\/pre>\n\n\n\n<h3 id=\"h_448852278_7\"><strong>3.&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter2\/4?fw=pt\" target=\"_blank\" rel=\"noreferrer 
Tokenizers">
noopener\">Tokenizers<\/a><\/strong><\/h3>\n\n\n\n<p><strong>1) Loading and saving<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import BertTokenizer\n\ntokenizer = BertTokenizer.from_pretrained(\"bert-base-cased\")\ntokenizer(\"Using a Transformer network is simple\")\n\n<em># output<\/em>\n'''\n{'input_ids': &#091;101, 7993, 170, 11303, 1200, 2443, 1110, 3014, 102],\n 'token_type_ids': &#091;0, 0, 0, 0, 0, 0, 0, 0, 0],\n 'attention_mask': &#091;1, 1, 1, 1, 1, 1, 1, 1, 1]}\n'''\n\n<em># save<\/em>\ntokenizer.save_pretrained(\"directory_on_my_computer\")<\/code><\/pre>\n\n\n\n<p><strong>2) Tokenization<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import AutoTokenizer\n\ntokenizer = AutoTokenizer.from_pretrained(\"bert-base-cased\")\n\nsequence = \"Using a Transformer network is simple\"\ntokens = tokenizer.tokenize(sequence)\n\nprint(tokens) <em># output: &#091;'Using', 'a', 'transform', '##er', 'network', 'is', 'simple']<\/em>\n\n<em># from tokens to input IDs<\/em>\nids = tokenizer.convert_tokens_to_ids(tokens)\nprint(ids) <em># output: &#091;7993, 170, 11303, 1200, 2443, 1110, 3014]<\/em><\/code><\/pre>\n\n\n\n<p><strong>3) Decoding<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>decoded_string = tokenizer.decode(&#091;7993, 170, 11303, 1200, 2443, 1110, 3014])\nprint(decoded_string) <em># output: 'Using a Transformer network is simple'<\/em><\/code><\/pre>\n\n\n\n<h3 id=\"h_448852278_8\">4.&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter2\/5?fw=pt\" target=\"_blank\" rel=\"noreferrer noopener\">Handling multiple sequences<\/a><\/h3>\n\n\n\n<p><strong><u>1)&nbsp;<\/u>Models expect a batch of 
inputs<\/strong><\/p>\n\n\n\n<p>Convert the list of numbers to a tensor and send it to the model:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>import torch\nfrom transformers import AutoTokenizer, AutoModelForSequenceClassification\n\ncheckpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"\ntokenizer = AutoTokenizer.from_pretrained(checkpoint)\nmodel = AutoModelForSequenceClassification.from_pretrained(checkpoint)\n\nsequence = \"I've been waiting for a HuggingFace course my whole life.\"\n\ntokens = tokenizer.tokenize(sequence)\nids = tokenizer.convert_tokens_to_ids(tokens)\n\ninput_ids = torch.tensor(&#091;ids])\nprint(\"Input IDs:\", input_ids)\n\noutput = model(input_ids)\nprint(\"Logits:\", output.logits)\n\n<em># output<\/em>\n'''\nInput IDs: &#091;&#091; 1045,  1005,  2310,  2042,  3403,  2005,  1037, 17662, 12172,  2607, 2026,  2878,  2166,  1012]]\nLogits: &#091;&#091;-2.7276,  2.8789]]\n'''<\/code><\/pre>\n\n\n\n<p><strong>2) Padding the inputs<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>model = AutoModelForSequenceClassification.from_pretrained(checkpoint)\n\nsequence1_ids = &#091;&#091;200, 200, 200]]\nsequence2_ids = &#091;&#091;200, 200]]\nbatched_ids = &#091;\n    &#091;200, 200, 200],\n    &#091;200, 200, tokenizer.pad_token_id],\n]\n\nprint(model(torch.tensor(sequence1_ids)).logits)\nprint(model(torch.tensor(sequence2_ids)).logits)\nprint(model(torch.tensor(batched_ids)).logits)\n\n<em># output<\/em>\n'''\ntensor(&#091;&#091; 1.5694, -1.3895]], grad_fn=&lt;AddmmBackward&gt;)\ntensor(&#091;&#091; 0.5803, -0.4125]], grad_fn=&lt;AddmmBackward&gt;)\ntensor(&#091;&#091; 1.5694, -1.3895],\n        &#091; 1.3373, -1.2163]], grad_fn=&lt;AddmmBackward&gt;)\n'''<\/code><\/pre>\n\n\n\n<h3 id=\"h_448852278_9\">5. 
Summary&nbsp;<a href=\"https:\/\/huggingface.co\/course\/chapter2\/6?fw=pt\" target=\"_blank\" rel=\"noreferrer noopener\">Putting it all together<\/a><\/h3>\n\n\n\n<p>We have explored how tokenizers work and looked at tokenization, conversion to input IDs, padding, truncation, and attention masks. The Transformers API can handle all of this for us through a high-level function.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>from transformers import AutoTokenizer\n\ncheckpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"\ntokenizer = AutoTokenizer.from_pretrained(checkpoint)\n\nsequence = \"I've been waiting for a HuggingFace course my whole life.\"\n\nmodel_inputs = tokenizer(sequence)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code><em># Can tokenize a single sequence<\/em>\nsequence = \"I've been waiting for a HuggingFace course my whole life.\"\nmodel_inputs = tokenizer(sequence)\n\n<em># Can also handle multiple sequences at once<\/em>\nsequences = &#091;\"I've been waiting for a HuggingFace course my whole life.\", \"So have I!\"]\nmodel_inputs = tokenizer(sequences)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code><em># Can pad according to several strategies<\/em>\n<em># Will pad the sequences up to the maximum sequence length<\/em>\nmodel_inputs = tokenizer(sequences, padding=\"longest\")\n\n<em># Will pad the sequences up to the model max length<\/em>\n<em># (512 for BERT or DistilBERT)<\/em>\nmodel_inputs = tokenizer(sequences, padding=\"max_length\")\n\n<em># Will pad the sequences up to the specified max length<\/em>\nmodel_inputs = tokenizer(sequences, padding=\"max_length\", max_length=8)<\/code><\/pre>\n\n\n\n<pre 
class=\"wp-block-code\"><code><em># \u8fd8\u53ef\u4ee5\u622a\u65ad\u5e8f\u5217<\/em>\nsequences = &#091;\"I've been waiting for a HuggingFace course my whole life.\", \"So have I!\"]\n\n<em># Will truncate the sequences that are longer than the model max length<\/em>\n<em># (512 for BERT or DistilBERT)<\/em>\nmodel_inputs = tokenizer(sequences, truncation=True)\n\n<em># Will truncate the sequences that are longer than the specified max length<\/em>\nmodel_inputs = tokenizer(sequences, max_length=8, truncation=True)<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code><em># \u53ef\u4ee5\u5904\u7406\u5230\u7279\u5b9a\u6846\u67b6\u5f20\u91cf\u7684\u8f6c\u6362\uff0c\u7136\u540e\u53ef\u4ee5\u5c06\u5176\u76f4\u63a5\u53d1\u9001\u5230\u6a21\u578b\u3002<\/em>\nsequences = &#091;\"I've been waiting for a HuggingFace course my whole life.\", \"So have I!\"]\n\n<em># Returns PyTorch tensors<\/em>\nmodel_inputs = tokenizer(sequences, padding=True, return_tensors=\"pt\")\n\n<em># Returns TensorFlow tensors<\/em>\nmodel_inputs = tokenizer(sequences, padding=True, return_tensors=\"tf\")\n\n<em># Returns NumPy arrays<\/em>\nmodel_inputs = tokenizer(sequences, padding=True, return_tensors=\"np\")<\/code><\/pre>\n\n\n\n<p><strong>Special tokens<\/strong><\/p>\n\n\n\n<blockquote class=\"wp-block-quote\"><p>\u5206\u8bcd\u5668\u5728\u5f00\u5934\u6dfb\u52a0\u7279\u6b8a\u8bcd[CLS]\uff0c\u5728\u7ed3\u5c3e\u6dfb\u52a0\u7279\u6b8a\u8bcd[SEP]\u3002<\/p><\/blockquote>\n\n\n\n<pre class=\"wp-block-code\"><code>sequence = \"I've been waiting for a HuggingFace course my whole life.\"\n\nmodel_inputs = tokenizer(sequence)\nprint(model_inputs&#091;\"input_ids\"])\n\ntokens = tokenizer.tokenize(sequence)\nids = tokenizer.convert_tokens_to_ids(tokens)\nprint(ids)\n\n<em># \u8f93\u51fa<\/em>\n'''\n&#091;101, 1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 1012, 102]\n&#091;1045, 1005, 2310, 2042, 3403, 2005, 1037, 17662, 12172, 2607, 2026, 2878, 2166, 
1012]\n'''\n\nprint(tokenizer.decode(model_inputs&#091;\"input_ids\"]))\nprint(tokenizer.decode(ids))\n\n<em># output<\/em>\n'''\n\"&#091;CLS] i've been waiting for a huggingface course my whole life. &#091;SEP]\"\n\"i've been waiting for a huggingface course my whole life.\"\n'''<\/code><\/pre>\n\n\n\n<pre class=\"wp-block-code\"><code><em># Summary: from tokenizer to model<\/em>\nimport torch\nfrom transformers import AutoTokenizer, AutoModelForSequenceClassification\n\ncheckpoint = \"distilbert-base-uncased-finetuned-sst-2-english\"\ntokenizer = AutoTokenizer.from_pretrained(checkpoint)\nmodel = AutoModelForSequenceClassification.from_pretrained(checkpoint)\nsequences = &#091;\"I've been waiting for a HuggingFace course my whole life.\", \"So have I!\"]\n\ntokens = tokenizer(sequences, padding=True, truncation=True, return_tensors=\"pt\")\noutput = model(**tokens)<\/code><\/pre>\n\n\n\n<p>Huggingface Transformers library study notes (2): Using Transformers, Part 1: <a href=\"https:\/\/blog.csdn.net\/u011426236\/article\/details\/115460564\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/blog.csdn.net\/u011426236\/article\/details\/115460564<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Huggingface Transformers is a library of pretrained language models built on the open-source transformer architecture &hellip; <a href=\"http:\/\/139.9.1.231\/index.php\/2022\/08\/09\/huggingface-transformers\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\">\ud83e\udd17 Huggingface 
Transformers<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[4,12],"tags":[],"_links":{"self":[{"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/posts\/5352"}],"collection":[{"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/comments?post=5352"}],"version-history":[{"count":39,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/posts\/5352\/revisions"}],"predecessor-version":[{"id":5864,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/posts\/5352\/revisions\/5864"}],"wp:attachment":[{"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/media?parent=5352"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/categories?post=5352"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/139.9.1.231\/index.php\/wp-json\/wp\/v2\/tags?post=5352"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}